APPARATUSES AND METHODS FOR
PRODUCING RUNTIME ARCHITECTURES
OF COMPUTER PROGRAM MODULES

CROSS-REFERENCE TO RELATED
APPLICATIONS

This application claims priority from U.S. Provisional
Patent Application No. 60/937,704, filed Jun. 29, 2007, which
is incorporated herein by reference.

STATEMENT REGARDING 
FEDERALLY-SPONSORED RESEARCH AND 
DEVELOPMENT 

This invention was made, at least in part, with government
support under NSA number MDA904-03-C-1107; NASA
number NNA05CS30A; NSF number CCF-0546550; and
DARPA number HR00110710019. The United States
government may have certain rights in this invention.

FIELD OF THE INVENTION 

The present invention is directed generally to apparatuses
and methods for producing run-time architectures of
computer program modules.

BACKGROUND OF THE INVENTION 

1 Introduction 

Many architectural views are needed to describe a software
system. The code architecture or module view organizes code
entities in terms of classes, packages, layers and modules, and
is useful for studying properties such as maintainability;
another useful view is the runtime architecture of a system [P.
Clements et al. Documenting Software Architecture.
Addison-Wesley, 2003]. A runtime architecture, also known as a
Component-and-Connector (C&C) view, models runtime
entities and their potential interactions. A related notion is
that of an object diagram, a diagram of object structures
which shows object instances exclusively. Gamma et al. use
object diagrams extensively to explain the Gang of Four
design patterns [E. Gamma, R. Helm, R. Johnson, and J.
Vlissides. Design Patterns: Elements of Reusable
Object-Oriented Software. Addison-Wesley, 1994.].

A runtime component is a unit of computation and state
that has a runtime presence. Thus, in an object-oriented
system, a component is an object or a group of objects [P.
Clements et al. Documenting Software Architecture.
Addison-Wesley, 2003.]. A connector is an abstraction of a runtime
interaction. In an object-oriented system, a connector models
one or more object relations. Architectures often organize
components into tiers. A tier is a conceptual partitioning of
functionality. A layer denotes a partition in the code
architecture or a module view [P. Clements et al. Documenting
Software Architecture. Addison-Wesley, 2003]; it can be
represented as a package and enforced using dependency rules [N.
Sangal, E. Jordan, V. Sinha, and D. Jackson. Using
Dependency Models to Manage Complex Software Architecture. In
OOPSLA, 2005.]. A tier denotes a partition in a runtime view
[P. Clements et al. Documenting Software Architecture.
Addison-Wesley, 2003.]. Finally, architectures are often
hierarchical, whereby a component can have a nested sub-architecture
consisting of lower-level components and connectors [N.
Medvidovic and R. N. Taylor. A Classification and
Comparison Framework for Software Architecture Description
Languages. IEEE TSE, 26(1), 2000.].

While the above definitions are consistent with formal
Architecture Description Languages (ADLs) and software
architecture research, developers intuitively draw such
runtime architectures on whiteboards. FIG. 1 is a runtime
architecture drawn by the developer of a system we study in
Section 5.3. The subject system, Aphyds, is an 8,000-line
program, and the developer who drew FIG. 1 is an experienced
programmer with a Ph.D. in computer science but no formal
training in software architecture. The drawing follows the
Model-View design pattern, with the user interface above the
line in the middle of the diagram and the circuit and
computational code below the line. Notice node and net inside
circuit’s sub-architecture. The unlabeled arrows (including the
thick arrows) represent data flow while the arrows labeled call
represent control flow.

Architectural-level analyses for properties such as
performance, reliability or security require runtime views.
Moreover, having an up-to-date as-built runtime architecture
enables checking the conformance of a system with its as- 
designed architecture. Despite receiving much research atten- 
tion, architectural extraction remains a hard problem. 

Recovering meaningful runtime architectures statically is 
hard for object-oriented systems since their runtime architec- 
ture often bears little resemblance to their code architecture. 
In fact, most recovery approaches employ a mix of static and 
dynamic information. To simplify the problem of relating 
architecture to code, previous research mandated specific 
implementation frameworks [N. Medvidovic and R. N. Tay- 
lor. A Classification and Comparison Framework for Soft- 
ware Architecture Description Languages. IEEE TSE, 26(1), 
2000.] or extended the language to specify a
component-and-connector architecture directly in code [J. Aldrich, C.
Chambers, and D. Notkin. ArchJava: Connecting Software
Architecture to Implementation. In ICSE, 2002.], [J. Schafer, M.
Reitz, J.-M. Gaillourdet, and A. Poetzsch-Heffter. Linking
Programs to Architectures: An Object-Oriented Hierarchical
Software Model based on Boxes. In The Common
Component Modeling Example: Comparing Software Component
Models, 2008.]. Such proposals require re-engineering
existing implementations. Embodiments of the present invention
will support existing object-oriented languages, design idi- 
oms and existing libraries and frameworks. 

Intuitively, many have preferred dynamic analyses to 
extract the as-built runtime architecture. Such an analysis 
monitors one or more program runs and shows snapshots of 
the system’s runtime architecture for those runs [M. Sefika, 
A. Sane, and R. Campbell. Architecture Oriented Visualiza- 
tion. In OOPSLA, 1996.], [C. Flanagan and S. N. Freund. 
Dynamic Architecture Extraction. In FLoC FATES-RV, 
2006.], [B. Schmerl, J. Aldrich, D. Garlan, R. Kazman, and H.
Yan. Discovering Architectures from Running Systems. 
IEEE TSE, 32(7), 2006.]. But these descriptions are partial 
and cover only a few representative interactions between 
objects, based on particular inputs and exercised use cases. A 
true architecture is meant to capture a complete description of 
the system’s runtime structure. To meet this goal, a static 
analysis is preferred. 

A static analysis must also be sound and not fail to reveal 
entities and relationships that actually exist at runtime. For 
instance, an architectural-level security analysis requires a 
complete architectural description to handle the worst, not the 
typical, case of runtime component communication. 

Accordingly, many existing object-oriented systems would
benefit from having an up-to-date documented runtime
architecture, which would enable several software assurance
techniques. It is hard, with the prior art, to statically extract a
sound runtime architecture from an object-oriented system
that is written in an existing programming language and follows
common design idioms, so many have preferred dynamic
analyses.

BRIEF SUMMARY OF THE INVENTION 

The present invention is directed generally to apparatuses 
and methods for producing run-time architectures of com- 
puter program modules. In particular, ownership domain 
annotations specify in code some architectural intent related 
to object encapsulation and communication; the present 
invention shows they also enable the static extraction of a 
sound runtime architecture. The present invention also
describes the core analysis first informally, then formally, and 
proves key soundness theorems. It is also shown how the 
present invention improves the precision of the analysis based 
on knowledge gained from applying the method on real 
object-oriented code. Furthermore, the present invention 
evaluates the analysis on several extended examples of 
medium-sized representative programs that we annotated 
manually. Those and other advantages of the present inven- 
tion will be described herein. 

According to one embodiment, the present invention 
includes a method for producing a representation of a true 
runtime object graph from a computer program module. In 
general, the method includes creating an abstract graph and 
creating a runtime graph from the abstract graph. More spe- 
cifically, one embodiment of the method includes creating an 
abstract graph from the computer program module and from 
containment information corresponding to the computer pro- 
gram module, wherein the abstract graph has nodes including 
types and objects, and wherein the abstract graph relates an 
object to a type, and wherein for a specific object the abstract 
graph relates the specific object to a type containing the 
specific object. The method also includes creating a runtime 
graph from the abstract graph, wherein the runtime graph is a 
representation of the true runtime object graph, wherein the 
runtime graph represents containment information such that, 
for a specific object, the runtime graph relates the specific 
object to another object that contains the specific object. 

Many variations and modifications are possible with the
present invention. The present invention also
includes apparatuses. For example, the present invention may
be embodied as a computer, a plurality of computers or a 
computer system, computer-readable instructions, or other 
variations thereof. In one embodiment, the present invention 
is an apparatus including a processor and memory. The 
memory includes computer readable instructions which, 
when executed by the processor, cause the processor to create 
an abstract graph from the computer program module and 
from containment information corresponding to the com- 
puter program module, wherein the abstract graph has nodes 
including types and objects, and wherein the abstract graph 
relates an object to a type, and wherein for a specific object the 
abstract graph relates the specific object to a type containing 
the specific object. The computer readable instructions also 
cause the processor to create a runtime graph from the 
abstract graph, wherein the runtime graph is a representation 
of the true runtime object graph, wherein the runtime graph 
represents containment information such that, for a specific
object, the runtime graph relates the specific object to another 
object that contains the specific object. 

In another embodiment, the present invention is embodied 
as computer readable instructions (apart from any computer)
which, when executed, cause a processor to perform the steps,
or variations of the steps, described above.

Many variations are possible with the present invention, 
and these and other teachings, variations, and advantages of
the present invention will become apparent from the
following detailed description of the invention.

BRIEF DESCRIPTION OF THE SEVERAL 
VIEWS OF THE DRAWING 

Embodiments of the present invention will now be 
described, by way of example only, with reference to the 
accompanying drawings for the purpose of illustrating the 
embodiments, and not for purposes of limiting the invention, 
wherein: 

FIG. 1 illustrates a runtime architecture of an 8,000-line 
system. 

FIG. 2 illustrates a thumbnail of the JHotDraw object graph
obtained at compile-time by Womble.

FIG. 3a illustrates one embodiment of ownership domains.

FIG. 3b illustrates the legend used to represent ownership
domains.

FIG. 4a illustrates an example of DataAccess code with
annotations.

FIG. 4b illustrates an abstract graph for the DataAccess 
example. 

FIG. 5a illustrates one embodiment of an abstract graph. 

FIG. 5b illustrates one embodiment of a runtime graph. 

FIG. 5c illustrates one embodiment of a display graph.

FIG. 5d illustrates an abstract graph for the QuadTree
example.

FIG. 5e illustrates a runtime graph for the QuadTree 
example. 

FIG. 5f illustrates a display graph for the QuadTree
example. 

FIG. 6a illustrates a partial runtime graph for the
DataAccess example without merging and without pulling.

FIG. 6b illustrates a partial runtime graph for the
DataAccess example with merging but without pulling.

FIG. 6c illustrates a partial runtime graph for the
DataAccess example with merging and with pulling.

FIG. 6d illustrates the complete OOG for the DataAccess
example.

FIG. 7a illustrates an output of a dynamic analysis on an
IntList example.

FIG. 7b illustrates object summarization on an IntList
example.

FIG. 7c illustrates an OOG for an IntList example.

FIG. 7d illustrates summary edges for an IntList example.

FIGS. 8a-8c illustrate data type declarations and rewriting
rules to convert an AbstractGraph into a RuntimeGraph.

FIG. 9 illustrates the pulling rule. 

FIG. 10 illustrates the DataAccess OOG with formal
domains.

FIGS. 11a-11c illustrate the rewriting rules on the
DataAccess example.

FIG. 12 illustrates the partial store typing rule. 

FIG. 13 illustrates the visitor to generate the abstract graph.

FIG. 14 illustrates the JHotDraw OOG using declarations. 

FIGS. 15a and 15e-15h illustrate the JHotDraw
OOG based on object instantiations but without type
abstraction.

FIG. 15b illustrates the JHotDraw OOG based on object
instantiations and type abstraction using the default trivial
types.

FIG. 15c illustrates the JHotDraw OOG based on object
instantiations and type abstraction using more precise trivial
types from a user-specified list.

FIG. 15d illustrates the JHotDraw OOG based on object 
instantiations and type abstraction using user-specified 
design intent types. 

FIGS. 16a-16e illustrate a snapshot of the OOG tool. 

FIG. 17 illustrates JHotDraw class diagram. 

FIGS. 18a-18d illustrate the top-level JHotDraw OOG.

FIG. 19 illustrates the top-level HillClimber OOG. 

FIG. 20a illustrates the Aphyds OOG obtained without 
using public domains. 

FIG. 20b illustrates the Aphyds OOG obtained using public
domains.

FIG. 21 illustrates one embodiment of a system according 
to the present invention. 

FIG. 22 illustrates one embodiment of a method according 
to the present invention. 

FIG. 23 illustrates the mapping to a C&C architecture. 

FIG. 24 illustrates one embodiment of a system according 
to the present invention. 

DETAILED DESCRIPTION OF THE INVENTION 

The present invention relates generally to apparatuses and 
methods for producing runtime architectures of computer 
programs. The present invention may be, for example, par- 
ticular methods for producing results as described herein 
(including, but not limited to, computer software, for accom- 
plishing those results), and apparatuses for performing the 
invention such as one or more computers or computer sys- 
tems for performing the invention. The computers may, for 
example, execute software embodying the invention. 
Although the present invention will generally be described in 
terms of computer software, the present invention may also be 
embodied as, and may operate on, programs in forms other 
than computer software such as, for example, firmware, hard- 
ware, and other forms. 

In general, the present invention will be described in terms 
of a method for producing a hierarchical representation of a 
runtime object graph that corresponds to a computer program 
module. The method includes creating an abstract graph from 
the computer program module and from containment infor- 
mation corresponding to the computer program module, 
wherein the abstract graph has nodes including types and 
objects, and wherein the abstract graph relates an object to a 
type, and wherein for a specific object the abstract graph 
relates the specific object to a type containing the specific 
object. The method also includes creating a runtime graph 
from the abstract graph, wherein the runtime graph is a rep- 
resentation of the true runtime object graph, wherein the 
runtime graph represents containment information such that, 
for a specific object, the runtime graph relates the specific 
object to another object that contains the specific object. 
Although the present invention is described in terms of spe- 
cific examples and embodiments, the present invention is not 
limited to the particular details of these examples and 
embodiments. 

The present invention is directed to the static extraction of
a runtime view of an object-oriented system with a
two-pronged approach: a) assume that annotations encode and
enforce the architectural intent in code [J. Aldrich and C.
Chambers. Ownership Domains: Separating Aliasing Policy
from Mechanism. In ECOOP, 2004.; M. Abi-Antoun and J.
Aldrich. Ownership Domains in the Real World. In IWACO,
2007.]; and b) leverage the annotations in a static analysis to
extract a sound runtime architecture from an annotated
program. In the software architecture literature, a sound
architecture often means an architecture with desirable quality
attributes. In the present invention, a sound architecture
shows the actual system structure at runtime notwithstanding
tight coupling between components or other undesirable
quality attributes. Embodiments of the present invention may
include one or more of the following features:

a. A static analysis for extracting an instance-based hierar- 
chical runtime architecture based on program annotations; 

b. A soundness proof of the extracted architecture;

c. An evaluation of the analysis on several representative 
medium-sized object-oriented systems. 

Outline. 

The description of the present invention is organized as
follows. We first discuss the requirements on a runtime
architecture (Section 2) and how the annotations help with
architectural extraction. In Section 3, we describe the core analysis
both informally and formally, and prove a key soundness
result. In Section 4, we improve the precision of the analysis
based on knowledge gained from applying the method on real
object-oriented code. Section 5 presents highlights of our
evaluation on several real systems. We conclude with a
discussion (Section 6) and a survey of related work (Section 7).

The present invention will sometimes be described in terms
of “code”, “programs”, “software”, and similar terms. Those
terms are understood in the art to mean “computer programs”,
and those terms are used as such in this document. Similarly,
the term “computer program module” is understood to mean
part or all of a computer program, and that term is used as such
in this document. For example, the present invention may be
applied to a computer program module, which means that the
present invention may be applied to an entire computer
program, or the present invention may be applied to only a
portion of a computer program, even if the entire computer
program is present.

2 Overall Strategy 

A runtime object graph represents a running
object-oriented program where nodes correspond to runtime objects,
and edges correspond to relations between objects. A sound
runtime architecture must statically approximate all the
runtime object graphs that any program run may generate.

Existing static analyses that extract a system’s execution
structure produce low-level, non-hierarchical object graphs
that explain runtime interactions in detail but convey little
architectural abstraction [R. W. O’Callahan. Generalized
Aliasing as a Basis for Program Analysis Tools. PhD thesis,
CMU, 2001], [D. Jackson and A. Waingold. Lightweight
Extraction of Object Models from Bytecode. TSE, 27(2),
2001], [A. Spiegel. Automatic Distribution of
Object-Oriented Programs. PhD thesis, FU Berlin, 2002.]. FIG. 2 shows
the output of a static object graph analysis, Womble [D.
Jackson and A. Waingold. Lightweight Extraction of Object
Models from Bytecode. TSE, 27(2), 2001.], on a 15,000-line
program, JHotDraw (http://www.jhotdraw.org (Version 5.3)).
Low-level objects such as Dimension and Rectangle appear at
the same level as the root application object, JavaDrawApp.
Such a view is a far cry from what a developer might draw for
a runtime architecture (see FIG. 1).

2.1 Requirements on Solution 

We listed earlier several requirements on a solution,
namely that it be a static analysis and not require language
extensions. Furthermore, an analysis for object-oriented code
must also handle aliasing, recursion and inheritance. Some
key properties of a runtime architecture include:

a. Component=Objects.

An architecture must show components that correspond to 
runtime entities. For object-oriented systems, a component 
represents an object or a group of objects. A group of objects 
must be a meaningful abstraction, such as the circuit or viewer 
in FIG. 1. 

b. Connector=Object Relations.

An architecture has connectors that correspond to relations 
between runtime entities. For object-oriented systems, a con- 
nector represents a runtime interaction between some object 
in one component and some object in another component. 

c. Tier=Group. 

An architecture often groups conceptually related compo- 
nents into runtime tiers or partitions. 

d. Hierarchy. 

A component can have a nested sub-architecture consisting
of lower-level components and connectors. Hierarchy also 
provides abstraction since it enables both high-level under- 
standing and detail. 

e. Summarization. 

Different executions generate a different number of 
objects. Furthermore, the number of objects in the runtime 
object graph is unbounded. The architecture must be a finite 
representation of the runtime object graph. It is common 
practice to represent multiple objects at runtime with one 
canonical component. 

f. Scalability. 

Meaningful architectures would be most helpful for large 
systems. An architecture must scale, i.e., the size of top-level 
diagram should remain mostly constant as the program size 
increases arbitrarily. 

g. Aliasing. 

Ignoring aliasing may produce a misleading architecture.
For instance, Womble sometimes shows multiple nodes for
the same runtime object. In FIG. 2, there are multiple
JavaDrawApp nodes illustrated as thick-lined boxes. FIG. 2 also
shows a separate DrawingEditor instance when it is the same
object as the JavaDrawApp instance at runtime
(JavaDrawApp extends DrawingEditor). If two components are
shown as distinct when they are the same, an architectural
analysis may assign them different values for a key trustLevel
property. As a result, the validity of such an analysis may be
suspect at best. Some object graph analyses do not ignore
aliasing but use unscalable whole-program analyses [R. W.
O’Callahan. Generalized Aliasing as a Basis for Program
Analysis Tools. PhD thesis, CMU, 2001.].

h. Soundness. 

An architecture must be sound and represent all objects and 
relations between objects that may exist at runtime. We define 
soundness as: 

i. Component Soundness: 

An architecture is sound if for every runtime object graph,
there exists a map from objects to components, such that each
runtime object o is mapped to exactly one component C in the
architecture, i.e., the same runtime object must not map to
multiple components in the architecture.

j. Connector Soundness:

If there is a runtime connection between object o1 and
object o2 in the runtime object graph, then there is a connector
between components C1 and C2 corresponding to o1 and o2.

k. Tier Soundness:

If object o is in a runtime tier d in the runtime object graph,
then component C corresponding to o is in tier D in the
architecture.

l. Precision. 

An architecture is precise if it shows two runtime entities
that represent different conceptual design elements as two
different architectural entities. An architecture is imprecise if
its elements are too coarse grained and lump together runtime
elements that serve different conceptual purposes in the
design. For instance, an architecture with one component that
represents the entire system is sound but imprecise. This
definition of precision can be refined as:

m. Component Precision: 

The architecture shows two runtime entities that represent
two different conceptual design elements as two different
components.

n. Connector Precision:

The architecture shows two runtime relations that
represent two different conceptual interactions as two different
connectors.

Although these properties are important and one might
desire to achieve all such properties in a particular
embodiment of the present invention, it is not required that the
present invention include all, or even most, of these
properties. On the contrary, advantages of the present invention can
be realized with fewer than all of these properties, and different
applications and different product constraints will influence
the number and combination of properties embodied in a
particular example of the present invention.
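
By way of illustration only, the component and connector soundness
conditions above can be sketched as a check against a single runtime
object graph. The class name SoundnessSketch, its method names, and the
object names circuit1, node1 and node2 are hypothetical and are not
taken from any embodiment; tier soundness is omitted for brevity. A
sound architecture must satisfy these conditions for every runtime
object graph, which is why a static analysis, rather than a check of
this kind, is required.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical sketch: checks component and connector soundness
 * against ONE runtime object graph. All names are illustrative.
 */
final class SoundnessSketch {

    /**
     * Component soundness: every runtime object is mapped to a component.
     * A Map cannot assign two components to one object, so only coverage
     * needs to be checked here.
     */
    static boolean componentSound(Set<String> objects,
                                  Map<String, String> objectToComponent) {
        return objectToComponent.keySet().containsAll(objects);
    }

    /**
     * Connector soundness: each runtime edge o1 -> o2 is covered by a
     * connector C1 -> C2 between the corresponding components.
     * An edge or connector is a two-element list [source, target].
     */
    static boolean connectorSound(Set<List<String>> runtimeEdges,
                                  Map<String, String> objectToComponent,
                                  Set<List<String>> connectors) {
        for (List<String> edge : runtimeEdges) {
            String c1 = objectToComponent.get(edge.get(0));
            String c2 = objectToComponent.get(edge.get(1));
            if (c1 == null || c2 == null
                    || !connectors.contains(List.of(c1, c2))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Two runtime node objects are summarized by one "node" component.
        Map<String, String> mapping =
                Map.of("circuit1", "circuit", "node1", "node", "node2", "node");
        Set<List<String>> edges =
                Set.of(List.of("circuit1", "node1"), List.of("circuit1", "node2"));
        Set<List<String>> connectors = Set.of(List.of("circuit", "node"));

        System.out.println(componentSound(mapping.keySet(), mapping));  // true
        System.out.println(connectorSound(edges, mapping, connectors)); // true
        System.out.println(connectorSound(edges, mapping, Set.of()));   // false
    }
}
```

In this sketch, mapping both node objects to one component also
illustrates summarization: an unbounded number of runtime objects is
represented by a finite set of components.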

2.2 Strategy: Ownership Annotations 
According to some embodiments, the present invention
includes a principled two-pronged approach for extracting
statically a runtime architecture of a system: a) assume
developers add to the source code annotations to clarify some
architectural intent; and b) use a sound static analysis that
leverages the annotations and the code to extract a sound
runtime architecture.

For adoptability, the annotations we propose to use are not
radical language changes and do not affect the system’s
runtime semantics. The annotations support existing
object-oriented languages, design idioms, frameworks and libraries.
Instead of specifying components and connectors directly in
code [J. Aldrich, C. Chambers, and D. Notkin. ArchJava:
Connecting Software Architecture to Implementation. In
ICSE, 2002.], the annotations specify and enforce the sharing
of data between objects and constrain how the program can
alias objects [J. Aldrich and C. Chambers. Ownership
Domains: Separating Aliasing Policy from Mechanism. In
ECOOP, 2004.], which is a significant problem in creating
architectural models.

In object-oriented programs, this state sharing is often not
explicit but instead is implicit in the structure of references
created at runtime. The idea of using annotations to recover a
design from the code is not new [P. Lam and M. Rinard. A
Type System and Analysis for the Automatic Extraction and
Enforcement of Design Information. In ECOOP, 2003.]. But
previous annotation-based systems did not specify the
runtime instance structure or data sharing precisely and did not
handle inheritance [P. Lam and M. Rinard. A Type System
and Analysis for the Automatic Extraction and Enforcement
of Design Information. In ECOOP, 2003.].

Ownership type annotations are appealing because they
track objects, not classes (by contrast, the related confined
types track classes, not objects [B. Bokowski and J. Vitek.
Confined Types. In OOPSLA, 1999.]), and provide some
precision about aliasing [D. Clarke, J. Potter, and J. Noble.
Ownership Types for Flexible Alias Protection. In OOPSLA,
1998.], [C. Boyapati, B. Liskov, and L. Shrira. Ownership
Types for Object Encapsulation. In POPL, 2003.], [J. Aldrich
and C. Chambers. Ownership Domains: Separating Aliasing
Policy from Mechanism. In ECOOP, 2004.]. Moreover, a type
system checks the annotations at compile-time. Different
ownership type systems have various degrees of
expressiveness, but all support making an object owned by, i.e., part of,
another object’s representation, to enforce instance
encapsulation. This is a stronger guarantee than changing the
visibility of a field by marking it private; the latter does not prevent
a developer from defining a public method that returns an
alias.
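
The difference between field visibility and instance encapsulation
can be illustrated with a minimal plain-Java sketch. The classes Leaky
and Encapsulated are hypothetical names introduced only for this
illustration. Plain Java cannot express ownership, so the second class
relies on a defensive copy by convention, whereas an ownership type
system would be expected to reject the leaking accessor at
compile-time.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: a private field does not prevent aliasing. */
final class AliasLeakSketch {

    /** The field is private, yet a public method returns an alias to it. */
    static final class Leaky {
        private final List<String> items = new ArrayList<>();

        public List<String> getItems() {
            return items; // leaks the internal representation
        }

        public int size() {
            return items.size();
        }
    }

    /** Convention-based workaround in plain Java: hand out a copy. */
    static final class Encapsulated {
        private final List<String> items = new ArrayList<>();

        public List<String> getItems() {
            return new ArrayList<>(items); // callers never see the original
        }

        public int size() {
            return items.size();
        }
    }

    public static void main(String[] args) {
        Leaky leaky = new Leaky();
        leaky.getItems().add("injected");   // mutates private state from outside
        System.out.println(leaky.size());   // 1

        Encapsulated enc = new Encapsulated();
        enc.getItems().add("injected");     // mutates only the copy
        System.out.println(enc.size());     // 0
    }
}
```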

Ownership is one form of containment. There are several
notions of ownership. For instance, shallow ownership stops
an object from referencing another object, whereas deep
ownership stops an object from referencing another object’s
transitive representation. We describe an embodiment of the
present invention in terms of ownership, but the underlying
ideas apply to other forms of containment.

2.3 Ownership Domains Overview

The ownership domains type system uses annotations on
the reference types in the program to make the state sharing
between objects more explicit. A developer indicates what
domain an object is part of by annotating each reference to
that object in the program [J. Aldrich and C. Chambers.
Ownership Domains: Separating Aliasing Policy from
Mechanism. In ECOOP, 2004.].

Brief Review. 

An ownership domain is a conceptual group of objects
with an explicit name and explicit policies that govern how it
can reference objects in other domains. Each object belongs
to a single ownership domain that does not change at runtime.

Ownership domains may be declared at the top level of the
application or within an object. Each object can declare one or
more public or private domains to hold its internal objects,
thus supporting hierarchy. Permission to access an object
implies permission to access its public domains. Two objects
can access objects in the same domain by declaring a formal
domain parameter on one object and binding that formal
domain parameter to another domain as long as the
permissions allow that access. Finally, objects inside a private
domain are encapsulated, unless a policy explicitly links a
domain parameter to the private domain.

FIG. 3a shows object O declaring a private domain P and a
public domain D. Another object B can access objects inside
domain D. An object A cannot access objects inside domain P.
But within object O, object F inside domain D can access
object G inside domain P. Objects in both the public and the
private domain can access external objects.
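
The access rules illustrated by FIG. 3a can be approximated by the
following sketch. It is a deliberately simplified, hypothetical model:
the class DomainAccessSketch and the method canAccess are not part of
the ownership domains type system, domain links and formal domain
parameters are ignored, and all top-level objects are assumed to be
mutually accessible. The sketch only reproduces the four access
outcomes described for FIG. 3a.

```java
/**
 * Hypothetical, simplified model of the access rules shown in FIG. 3a.
 * Domain links and formal domain parameters are NOT modeled.
 */
final class DomainAccessSketch {

    static final class Obj {
        final String name;
        final Obj owner;              // null for a top-level object
        final boolean inPublicDomain; // ignored when owner == null

        Obj(String name, Obj owner, boolean inPublicDomain) {
            this.name = name;
            this.owner = owner;
            this.inPublicDomain = inPublicDomain;
        }
    }

    /** Returns true if 'from' may hold a reference to 'to' in this model. */
    static boolean canAccess(Obj from, Obj to) {
        if (to.owner == null) {
            return true; // assumption: top-level objects are reachable by all
        }
        // The owner itself and objects in any of its domains may access 'to'.
        boolean insideSameOwner = from == to.owner || from.owner == to.owner;
        if (insideSameOwner) {
            return true;
        }
        // Permission to access an object implies permission to access its
        // public domains; private domains stay hidden from outsiders.
        return to.inPublicDomain && canAccess(from, to.owner);
    }

    public static void main(String[] args) {
        Obj o = new Obj("O", null, false);
        Obj a = new Obj("A", null, false);
        Obj b = new Obj("B", null, false);
        Obj f = new Obj("F", o, true);   // inside public domain D of O
        Obj g = new Obj("G", o, false);  // inside private domain P of O

        System.out.println(canAccess(b, f)); // true:  B reaches D through O
        System.out.println(canAccess(a, g)); // false: P is private to O
        System.out.println(canAccess(f, g)); // true:  both are inside O
        System.out.println(canAccess(g, a)); // true:  A is an external object
    }
}
```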

Ownership domains can enforce a strict
owner-as-dominator discipline as follows: a) declare a domain as private; and b)
do not declare a domain link from a formal domain parameter
to a private domain [5]. In that case, the object is fully
encapsulated.

Ownership domains generalize ownership types [D.
Clarke, J. Potter, and J. Noble. Ownership Types for Flexible
Alias Protection. In OOPSLA, 1998.], by separating the
mechanism for dividing objects into hierarchical groups, and
the policy for constraining references between objects in
those groups. In previous ownership systems, ownership is a
relation between objects, such that one object owns another.
The ownership domains system generalizes this so that an
object defines multiple domains in which other objects reside.

For the purposes of the present invention, we consider the
ownership relation in previous systems to be a special case of
ownership domains in which there is a single distinguished
domain declared per object, in which we place all the objects
that in the previous system are owned by the current object
“this.” The “this” variable, also called “self” in some
computer programming languages, is used to refer to the current
object, i.e., the receiver of a method.
Relevance to Architectures.

We propose a straightforward mapping between ownership 
domains and the architectural concepts we discussed earlier. 
Ownership domain annotations support abstract reasoning 
about data sharing by assigning each object in the runtime 
object graph to a single ownership domain. 

The ownership domains declared at the top level map to the 
system’s runtime tiers. Ownership domains declared within 
an object express a sub-architecture within the object, one
that consists of other objects that represent its parts. This 
hierarchical containment relation enables architectural 
abstraction: the top-level domains may have only a few archi- 
tecturally relevant objects, i.e., components. And each of 
those components can be made up of more objects represent- 
ing subcomponents and so on, until low-level, less architec- 
turally relevant objects are reached. No programming lan- 
guage has an explicit tier construct, but ownership domain 
annotations can express and enforce a tiered architecture in 
code [J. Aldrich and C. Chambers. Ownership Domains: 
Separating Aliasing Policy from Mechanism. In ECOOP, 
2004.], [M. Abi-Antoun and J. Aldrich. Ownership Domains 
in the Real World. In IWACO, 2007.] . 

The annotations also describe policies that govern refer- 
ences between ownership domains. Objects within the same 
ownership domain can refer to one another. But references 
can only cross domain boundaries if there is a domain link 
between the two domains [J. Aldrich and C. Chambers. Own- 
ership Domains: Separating Aliasing Policy from Mecha- 
nism. In ECOOP, 2004.]. Each object can declare a policy to 
describe the permitted aliasing among objects in its internal 
domains, and between its internal domains and external 
domains. ADLs typically express such policies using con- 
straints. 

In short, the annotations specify and enforce in code, archi- 
tectural intent related to object encapsulation, logical con- 
tainment (hierarchy), architectural tiers and object commu- 
nication permissions. 

EXAMPLE 

FIG. 4a shows two classes with ownership domain anno- 
tations. In the description of the present invention, we use a 
simplified syntax similar to Java generics, but the concrete 
syntax uses existing language-support for annotations [M. 
Abi-Antoun and J. Aldrich. Ownership Domains in the Real 
World. In IWACO, 2007.]. Domain names are arbitrary (ex- 
cept for a few special annotations); we use capital letters to 
distinguish them from other identifiers. 

A DataAccess instance holds Integer and Number objects 
that must be accessible to other objects that have access to it. 
So class DataAccess declares a public domain STATE and 
declares those objects inside STATE. In addition, DataAccess 
requires some environment state that it does not own, so it 
declares a domain parameter PENV. Domain parameter 
PENV declares a field v2, a reference to a list of Integer 
objects. The outer PENV annotation is for the list object itself; 
the inner PENV annotation is for the list elements, the Integer 
objects. Some other object of type UnitTest, which has the 
ENV domain, binds its ENV domain to DataAccess’s PENV
domain parameter so that both objects can share the same 
environment state. 

Next, we use the ownership domain annotations to extract 
statically a sound runtime architecture of a system. 
3 Analysis

The analysis uses one or more different intermediate rep- 
resentations to extract the runtime architecture. According to 
one embodiment, the present invention first builds an abstract 
graph, converts it into a runtime graph, and the runtime graph 
is used as the representation of the runtime architecture. 
According to another embodiment, the runtime graph is used 
as an intermediate representation and the runtime graph is 
used to build an Ownership Object Graph (OOG) (the OOG 
will also be referred to as a display graph). Other variations 
and embodiments are also possible with the present invention. 

3.1 Abstract Graph 

The abstract graph is created from the computer program 
for which a runtime architecture is to be created, as well as 
from containment information corresponding to the com- 
puter program. The abstract graph may be created from many 
forms of computer programs, such as, but not limited to, 
source code, executable code, and byte code. As will be 
described in more detail herein below, the abstract graph has 
nodes including types and objects, and the abstract graph has 
at least one edge connecting one node to another node (See 
FIG. 5(a)). The nodes connected by the edge may be the same
node connected to itself, or the edge may connect different 
nodes. In addition to connecting nodes, the edges may also be 
between types, between objects, and between types and 
objects. 

The abstract graph has the abstract domains declared in 
each abstract type. Each abstract domain represents fields and 
variables declared inside it as abstract objects. A visitor builds 
the abstract graph from the Abstract Syntax Tree (AST) of the 
annotated program (it must also account for certain charac- 
teristics discussed in Section 4). 

FIG. 4b illustrates one embodiment of the abstract graph 
410 of the DataAccess system. UnitTest, DataAccess, and 
ArrayList<Integer> each represent a type with white-filled 
domains (ENV, DATA, STATE, PENV, and ELTS) declared 
inside them, and grey objects declared inside each domain. A 
formal domain parameter such as PENV 424 has a dotted 
border. Actual ownership domains such as STATE 422 and 
ENV 414 have a dashed border. A thick dotted edge repre- 
sents a type relationship (is-a). A solid edge represents a field 
reference (has-a). 

The abstract graph 410 includes types, objects, and
domains. UnitTest 412 is a type that declares two domains,
ENV 414 and DATA 416. Domain DATA 416 declares object
dataAccess 418 of type DataAccess 420. Type DataAccess
420 declares two domains, STATE 422 and PENV 424.
STATE 422 is an actual domain that declares two objects, int1
426 of type Integer and num1 428 of type Number, respec-
tively. PENV 424 is a formal domain parameter that declares
an object v2 430 of type ArrayList<Integer> 432. Type
ArrayList<Integer> 432 declares a formal domain parameter
ELTS 434, that contains an object Integer 436 of type Integer.
The declaration of ArrayList<Integer> 432 is based on
java.util.ArrayList but is simplified. The types for Integer and
Number are not shown. In some applications the abstract
graph may be adequate or useful in and of itself, while in other
applications the abstract graph is only an intermediate step.
For example:

No Instances. 

An abstract graph is not hierarchical in the sense of an 
object having children. Rather, an abstract object has an 
abstract type, an abstract type has abstract domains, and an 
abstract domain has abstract objects. For example, abstract 
object dataAccess 418 has type DataAccess 420, and abstract
type DataAccess 420 has domains STATE 422 and PENV
424, and abstract domain STATE 422 contains the abstract
object int1:Integer 426.

Aliasing Unaware. 

An abstract graph does not reflect possible aliasing. The
ownership domains type system guarantees that two objects
in different domains can never alias, but two objects in the
same domain may alias. As discussed in Section 2.1, if two
objects could be aliased, the architecture must show them as
one. In abstract graph 410 above, int1:Integer 426 and num1:
Number 428 in the STATE domain 422 may refer to the same
object.

Incomplete. 

An abstract domain in an abstract graph does not directly
show all the objects that are in a given domain. It contains
abstract objects only for the locally declared fields. E.g., in the
abstract graph 410, DataAccess 420 declares its v2:ArrayList
field 430 in its domain parameter PENV 424. Such non-local
fields do not appear where the domain is declared. Hence, the
empty ENV 414 domain inside UnitTest 412.
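
The three limitations above can be made concrete with a minimal Java sketch of an abstract graph. This is only an illustration under stated assumptions: the class AbstractGraphSketch, its fields, and its methods are hypothetical names introduced here, not the data types of FIGS. 8a-8c, and the ArrayList type of the example is omitted for brevity.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of an abstract graph: type -> domains -> objects -> type. */
final class AbstractGraphSketch {
    /** An abstract object is a declared field or variable: a name plus a type name. */
    static final class AbstractObject {
        final String name;
        final String type;

        AbstractObject(String name, String type) {
            this.name = name;
            this.type = type;
        }
    }

    // type name -> (domain name -> abstract objects declared locally in that domain)
    final Map<String, Map<String, List<AbstractObject>>> types = new LinkedHashMap<>();

    void declare(String type, String domain, AbstractObject... objects) {
        types.computeIfAbsent(type, t -> new LinkedHashMap<>())
             .put(domain, Arrays.asList(objects));
    }

    /** The domains "inside" an object are reachable only through its declared type. */
    Map<String, List<AbstractObject>> domainsOf(AbstractObject o) {
        return types.getOrDefault(o.type, Collections.emptyMap());
    }

    /** Builds the UnitTest/DataAccess part of the example described in the text. */
    static AbstractGraphSketch dataAccessExample() {
        AbstractGraphSketch g = new AbstractGraphSketch();
        g.declare("UnitTest", "ENV"); // empty: v2 is declared in PENV, not here
        g.declare("UnitTest", "DATA", new AbstractObject("dataAccess", "DataAccess"));
        g.declare("DataAccess", "STATE",
                new AbstractObject("int1", "Integer"),
                new AbstractObject("num1", "Number")); // not merged at this stage
        g.declare("DataAccess", "PENV", new AbstractObject("v2", "ArrayList<Integer>"));
        return g;
    }
}
```

In this sketch, the children of dataAccess are found only by looking up its type DataAccess (no instances), int1 and num1 remain separate entries in STATE (aliasing unaware), and ENV stays empty although v2 will be in it at runtime (incomplete).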

So the analysis converts an abstract graph into a runtime 
graph to approximate the true runtime object graph (ROG). 
We will explicitly refer to the true runtime object graph 
(ROG) to avoid confusion with the runtime graph. A dynamic 
analysis takes the runtime graph as input and does not have to
compute it. 

3.2 Runtime Graph 

A runtime graph is created from an abstract graph and it 
includes at least one object including information from at 
least one of the types in the abstract graph (See FIG. 5(b)).

The runtime graph may be created, for example, by creat- 
ing a runtime object from an object type and a containing 
object type, and by creating a containment edge associated 
with one or more runtime objects. In other words, creating a 
containment edge with two endpoints, wherein each endpoint
is associated with a runtime object. The endpoints of the
containment edge may both be associated with the same
runtime object, or the containment edges may be associated
with different runtime objects.

For example, one endpoint of a containment edge may be
associated with runtime object A, and the other endpoint may be
associated with runtime object B, and runtime objects A and
B may be the same runtime object or different runtime
objects. Creating a containment edge may, for example,
include creating a containment edge between runtime object
A and runtime object B (which may be the same runtime
object or different runtime objects) when the type corre-
sponding to runtime object A in the abstract graph contains
the abstract object corresponding to runtime object B.

In other words, a runtime graph instantiates the types in an
abstract graph and shows only objects and domains. Each
runtime object contains runtime domains and each runtime
domain contains runtime objects. Thus, in a runtime graph,
one can view the children of an object without going through
its declared type. Furthermore, to support the goals in Section
2.1, a runtime graph must address object merging, object 
aliasing, object pulling and object edges. We discuss each one 
in turn. 

We represent ownership domains as in FIG. 3b. A dashed
border white-filled rectangle represents an actual ownership
domain 310. A solid border grey-filled rectangle with a bold
label represents an object 312. A dashed edge represents a
link permission 314 between two ownership domains. A solid
edge represents a creation, usage, or reference relation 316
between two objects. A dotted border white filled rectangle
represents a formal domain 318. An object labeled “obj: T”
indicates an object of name “obj” and type T as in UML object
diagrams. The symbol (+) is appended to an object’s label
when its substructure is elided, as will be discussed later. 

Before discussing merging and pulling, we discuss FIG. 6a
which illustrates one embodiment of a partial runtime graph
610 for DataAccess OOG without merging and without pull-
ing. Object test 612 is an instance of the UnitTest type (412 in
FIG. 4b) from the abstract graph 410. Object test 612 declares
two domains DATA 416 and ENV 414. Domain DATA 416
contains an object dataAccess 418 of type DataAccess.
Object dataAccess 418 declares two domains, STATE 422
and PENV 424. Domain STATE 422 contains two objects,
int1 426 and num1 428. The formal domain parameter PENV
424 contains an object v2 430. In this intermediate represen-
tation 610, objects int1 426 and num1 428 are not merged yet,
even though they may alias because Integer is a subtype of
Number. In addition, object v2 430 is not yet pulled from the
formal domain parameter 424. Merging, Aliasing, and Pull-
ing will now be discussed.

Object Merging. 

Different executions may generate a different number of
objects, but an architecture must represent all possible execu-
tions. To address this, a runtime graph summarizes multiple
runtime objects with a canonical runtime object. Further,
exactly one object in a runtime graph represents each object in
the ROG.

For instance, a dynamic analysis might display individual
cells in a linked list of Integer objects, as cons1:Cons, cons2:
Cons, as shown in FIG. 7(a). In our approach, all the Cons
cells would get unified into a cons1:Cons object and a self-
edge would represent the reference to the next cell, as shown
in FIG. 7(b).

Object Aliasing. 

What about Aliasing? When converting abstract objects 
from the abstract graph into runtime objects, the analysis 
merges two abstract objects in the same domain, if their types
are related by inheritance. The ownership domains type sys- 
tem guarantees that two objects in different domains can 
never alias. 

FIG. 6b illustrates another embodiment of the runtime
graph 610 for the annotated code in FIG. 4a in which two
objects are merged. One runtime object 614, labeled with
num1:Number, merges the abstract objects int1 and num1
(shown as 426 and 428, respectively, in FIG. 4b) in domain
STATE 422. The merging is performed because Integer is a
subtype of Number. The runtime graph 610 still shows object
v2 430 in its formal domain parameter PENV 424, before it is
pulled.
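
The merging step can be sketched in Java as follows. This is a simplified illustration, not the rewriting rules themselves: MergeSketch and its members are hypothetical names, subtyping is approximated with Class.isAssignableFrom, compatibility is checked only against the current label of a runtime object, and the case of a new object that is compatible with two mutually incompatible existing objects is not handled.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: merge abstract objects in one domain when their types are related. */
final class MergeSketch {
    /** A runtime object summarizes the abstract objects (here: field names) it merges. */
    static final class RuntimeObject {
        Class<?> label; // one representative type, used as the display label
        final List<String> merged = new ArrayList<>();

        RuntimeObject(Class<?> label) {
            this.label = label;
        }
    }

    /** Two types are compatible if one is a subtype of the other. */
    static boolean compatible(Class<?> a, Class<?> b) {
        return a.isAssignableFrom(b) || b.isAssignableFrom(a);
    }

    /** Adds a field to a domain, merging it into a compatible runtime object if one exists. */
    static void add(List<RuntimeObject> domain, String field, Class<?> type) {
        for (RuntimeObject existing : domain) {
            if (compatible(existing.label, type)) {
                existing.merged.add(field);
                if (type.isAssignableFrom(existing.label)) {
                    existing.label = type; // keep the more general type as the label
                }
                return;
            }
        }
        RuntimeObject fresh = new RuntimeObject(type);
        fresh.merged.add(field);
        domain.add(fresh);
    }
}
```

Adding int1 of type Integer and then num1 of type Number to the same list yields a single runtime object labeled Number that merges both fields, which mirrors the merged object in domain STATE.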

Object Pulling. 

How to handle ownership domain parameters? For sound-
ness, each runtime object that is actually in a domain must
appear in that domain in the runtime graph. To ensure this
property, an abstract object declared inside a formal domain is
pulled into each actual domain that is bound to the formal
domain parameter. The term pulling generally means copying
(i.e., not moving) an object from one domain to another.

FIG. 6c illustrates another embodiment of the runtime
graph 610 with merging and with pulling. Object v2 430 was
pulled from the formal domain parameter PENV (424 in FIG.
4b) to the actual domain ENV 414 in object test 612. This
figure no longer displays the PENV formal domain parameter
(424 in FIG. 4b) for clarity. Stated another way, FIG. 6b
shows object v2 430 in the formal domain parameter PENV
(dotted border 424 in FIG. 4b). In FIG. 6c, object v2 430 is
pulled from the formal domain parameter PENV (shown as
424 in FIG. 4b) to the actual domain ENV 414 in UnitTest (the
former is bound to the latter using the annotation
DataAccess<ENV> in FIG. 4a). In most cases, we elide for-
mal domains after pulling, so FIG. 6c no longer displays
PENV (424 in FIG. 4b). Similarly, an ArrayList<Integer>
object 430 has a domain parameter ELTS that contains Inte-
ger objects; those get pulled from ELTS into ENV (See FIG.
10).
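
Pulling, including the transitive case from ELTS through PENV into ENV, can be sketched as a small fixed-point computation in Java. The sketch is an assumption-laden simplification: PullSketch and its members are hypothetical names, domains and objects are plain strings, and bindings are recorded globally rather than per object as in the formal rules.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: copy objects from formal domains into the domains bound to them. */
final class PullSketch {
    // domain name -> labels of the objects currently shown in that domain
    final Map<String, Set<String>> contents = new LinkedHashMap<>();
    // formal domain parameter -> domains that are bound to it
    final Map<String, Set<String>> boundTo = new LinkedHashMap<>();

    void declare(String domain, String object) {
        contents.computeIfAbsent(domain, d -> new LinkedHashSet<>()).add(object);
    }

    void bind(String formal, String target) {
        boundTo.computeIfAbsent(formal, f -> new LinkedHashSet<>()).add(target);
    }

    /** Copies (does not move) objects along the bindings until nothing changes. */
    void pullAll() {
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Map.Entry<String, Set<String>> binding : boundTo.entrySet()) {
                Set<String> source =
                        contents.getOrDefault(binding.getKey(), Collections.emptySet());
                for (String target : binding.getValue()) {
                    changed |= contents
                            .computeIfAbsent(target, d -> new LinkedHashSet<>())
                            .addAll(source);
                }
            }
        }
    }
}
```

With v2 declared in DataAccess::PENV, an Integer element declared in ArrayList::ELTS, ELTS bound to PENV, and PENV bound to UnitTest::ENV, calling pullAll() leaves both v2 and the Integer element in ENV while the formal domains keep their copies.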

Object Edges. 

Finally, the analysis adds field reference edges to the Runt- 
imeGraph 610, shown as solid edges in FIG. 6c. For instance, 
DataAccess declares the two fields int1 and num1 in domain
STATE. Objects int1 and num1 were merged so there is a field
reference edge from a DataAccess object to the merged 
object. It is also possible to add usage edges that show field 
accesses or method invocations. 

3.3 Ownership Object Graph (OOG) (Display Graph) 

A display graph (also referred to as the OOG) is a hierar- 
chical, depth limited representation of a runtime graph (See 
FIG. 5(c)). The degree to which the display graph is depth 
limited can vary depending on the particular needs and the 
particular applications of the invention. In other words, the 
depth of the display graph may be greater than or less than that 
described herein. Furthermore, in some embodiments, the 
display graph is not depth-limited, such as when there are no 
recursive types. 

A runtime object can contain itself, so the runtime graph 
must represent a potentially unbounded ROG with a finite 
representation. For example, consider a class QuadTree that 
declares several fields of type QuadTree in its owned domain 
(owned is a default private domain that need not be declared 
[J. Aldrich and C. Chambers. Ownership Domains: Separat- 
ing Aliasing Policy from Mechanism. In ECOOP, 2004.]): 

class QuadTree {
    owned QuadTree nwQuadTree;
}

Since there is a unique canonical object for each type in 
each domain, the object representing QuadTree in domain 
owned must also represent the child object of type QuadTree 
in the owned domain of the parent; it is therefore its own 
parent in this representation. A finite representation is essen- 
tial to ensure that the analysis terminates. But, in a hierarchy, 
no object is its own parent. So the analysis creates the OOG as 
a finite depth-limited unrolling of the runtime graph. In the 
example above, we show one QuadTree object within 
another, down to a finite depth. 

FIGS. 5(d), 5(e) and 5(f) show the abstract graph, the
runtime graph and the display graph for the QuadTree 
example. 

Cycle Detection. 

To break the recursion in the runtime graph, the analysis 
that generates the OOG stops when, from a given runtime 
object, it reaches the same runtime object a second time. 
Unlike the runtime graph, the OOG is a strictly hierarchical 
structure. 
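
The depth-limited unrolling with cycle detection can be sketched in Java as below. The sketch makes several assumptions: UnrollSketch and its members are hypothetical names, the runtime graph is flattened to a map from each object to the objects in its domains, and an object reached a second time on the current path is shown once more but not expanded, with (+) marking elided substructure.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical sketch: unroll a possibly cyclic runtime graph into a finite hierarchy. */
final class UnrollSketch {
    // runtime graph: object label -> labels of the objects in its domains (may be cyclic)
    final Map<String, List<String>> children = new HashMap<>();

    /** A node of the strictly hierarchical display graph. */
    static final class Node {
        final String label;
        final List<Node> kids = new ArrayList<>();

        Node(String label) {
            this.label = label;
        }
    }

    Node unroll(String root, int depth) {
        return unroll(root, depth, new ArrayDeque<>());
    }

    private Node unroll(String object, int depth, Deque<String> path) {
        List<String> kids = children.getOrDefault(object, Collections.emptyList());
        // Stop at the depth limit, or when this object is reached a second time.
        if (depth == 0 || path.contains(object)) {
            return new Node(kids.isEmpty() ? object : object + "(+)");
        }
        Node node = new Node(object);
        path.push(object);
        for (String kid : kids) {
            node.kids.add(unroll(kid, depth - 1, path));
        }
        path.pop();
        return node;
    }

    /** Renders the hierarchy as nested brackets, e.g. a[b[c]]. */
    static String render(Node n) {
        if (n.kids.isEmpty()) {
            return n.label;
        }
        return n.kids.stream()
                .map(UnrollSketch::render)
                .collect(Collectors.joining(", ", n.label + "[", "]"));
    }
}
```

For a root object that contains a QuadTree object which in turn contains itself, unrolling stops after showing one QuadTree within another, so the result is a finite tree even though the runtime graph has a self-loop.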

Edge Summaries. 

The OOG is depth-restricted but must still show all rela- 
tions that exist at runtime. Merely truncating the recursion 
may fail to reveal all relations. For instance, child objects in a 
hierarchy may have fields that point to external objects, and 
the child objects may be beyond the visible depth. The analy- 
sis automatically adds summary edges from the parent 
objects to those external objects. 

If the user-specified projection depth is d, the analysis 
projects the runtime graph to a depth of d+n in order to 
produce summary edges that are due to the nodes below the 
cutoff depth. Because of the regularity of the projection tech- 
nique, we hypothesize that: for any graph, there is a fixed 
depth n that is sufficient to produce all such summary edges, 
and thus to produce a sound display graph. However, we do
not yet have a proof of this nor a way of calculating the 
constant n. 

For example, consider a list of Integer objects in FIG. 7(c). 
Object test 712 contains two domains, LIST 714 and DATA 
716. Domain LIST 714 declares object lst 718, a list of
Integer objects of type IntList. Object IntList has a public
domain ITERS 720 for its iterators and a private domain
OWNED 722 to hold the representation of the linked list. The
DATA domain 716 contains the list elements, i.e., Integer
objects. After pulling, the head of the list current refers to an
Integer object 724 in the DATA domain 716 containing the list
elements. (This is a canonical example to evaluate an owner- 
ship type system’s expressiveness; the annotated code is in 
the ownership domains paper [J. Aldrich and C. Chambers. 
Ownership Domains: Separating Aliasing Policy from 
Mechanism. In ECOOP, 2004.]). 

In FIG. 7(d), the projection depth is reduced to elide the 
substructure of IntList 718. The analysis adds a summary 
edge 726 from the IntList object 718 to the pulled Integer 
object 724. The symbol (+) is appended to the object’s label. 
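
One way to picture summary edges is to lift each endpoint of a runtime edge to its nearest visible ancestor. The Java sketch below follows that idea under stated assumptions: SummaryEdgeSketch, its members, and the object name head:Cons are hypothetical, the full runtime graph is assumed to be available (rather than a projection to depth d+n), and self-edges produced by lifting are dropped.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: replace edges of hidden objects with edges of visible ancestors. */
final class SummaryEdgeSketch {
    final Map<String, String> parent = new HashMap<>(); // object -> containing object
    final Set<String> visible = new HashSet<>();        // objects within the shown depth
    final List<String[]> edges = new ArrayList<>();     // {source, target} runtime edges

    /** Returns the nearest visible ancestor of an object (possibly the object itself). */
    String lift(String object) {
        String current = object;
        while (current != null && !visible.contains(current)) {
            current = parent.get(current);
        }
        return current;
    }

    /** Edges to display: direct edges between visible objects plus summary edges. */
    Set<String> displayEdges() {
        Set<String> shown = new LinkedHashSet<>();
        for (String[] edge : edges) {
            String source = lift(edge[0]);
            String target = lift(edge[1]);
            // Assumption: an edge that collapses onto a single object is not shown.
            if (source != null && target != null && !source.equals(target)) {
                shown.add(source + " -> " + target);
            }
        }
        return shown;
    }
}
```

If the list representation head:Cons is hidden inside lst:IntList and refers to an Integer object in another domain, the only displayed edge goes from lst:IntList to that Integer object, which corresponds to the summary edge 726 described above.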

Recapitulation. 

The OOG is a graph with two types of nodes, objects and 
domains. The nodes form a hierarchy where each object node 
has a unique parent domain and each domain node has a 
unique parent object. The root of the graph is a top-level 
domain. There are two edge types. Edges between objects 
correspond to field reference or usage relations. Edges 
between domains correspond to domain links. 

Finally, the OOG handles object merging, object aliasing 
and object pulling. Compared to earlier definitions of object 
graphs [J. Potter, J. Noble, and D. Clarke. The Ins and Outs of 
Objects. In Australian Softw. Eng. Conf., 1998.], the OOG
explicitly represents domains (sometimes called “clusters”) 
of objects and edges between these domains (sometimes 
called “domain links”). In contrast to other object ownership 
hierarchies [T. Hill, J. Noble, and J. Potter. Scalable Visual- 
izations of Object-Oriented Systems with Ownership Trees. 
J. Visual Languages and Computing, 13(3), 2002.], [A. Pota- 
nin, J. Noble, and R. Biddle. Checking Ownership and Con- 
finement. Concurrency and Computation: Practice and 
Experience, 16(7), 2004.], in an OOG, the owner of an object 
is a domain not another object. 

Finally, in the description of the present invention, we 
sometimes elide the root domain and root object from the 
displayed OOGs for readability. Often times, the root object is 
an instance of a fake class whose sole purpose is to declare the 
top-level domains [M. Abi-Antoun and J. Aldrich. Ownership 
Domains in the Real World. In IWACO, 2007.]. 

FIG. 6d illustrates one embodiment of the complete
dataAccess OOG or display graph 650. In addition to the
domains and objects previously discussed, the ENV domain
414 declares an object Integer 630. Object Integer 630 was
pulled from the ELTS formal domain parameter (424 in FIG.
4b) to the ENV domain 414. The bold edge 632 marked with
the symbol X is an example of an edge that is not added, even
though Integer 430 is a subtype of Number 614 because the
ELTS formal domain parameter (424 in FIG. 4b) is not bound
to the STATE domain 422. This makes OOG edges more 
precise than those obtained by superimposing field reference 
edges based on the associations in a class diagram. 

3.4 Rewriting Rules 

In this section, we formally describe the analysis. We use a 
labeled record notation for the data type declarations of the 
AbstractGraph and the RuntimeGraph (FIGS. 8a-8c). In par- 
ticular, FIGS. 8a-8c illustrate data type declarations and 
rewriting rules to convert an AbstractGraph into a Runt-
imeGraph. We use (...) for a tuple, {o ...} for a set and
[d ...] for a sequence. We use <: to denote standard subtyp-
ing. We sometimes qualify a domain d by the type T that
declares it as T::d. We describe the algorithm to construct a
RuntimeGraph from an AbstractGraph using small-step
rewriting rules (FIGS. 8a-8c). We refer to each rule by name 
in the description (using small caps). 

To help keep the representations distinct, we use English
letters (o, d, ...) for elements of the AbstractGraph and Greek
letters ($\theta$, ...) for elements of the RuntimeGraph. The
AbstractGraph consists of the AbstractTypes in the program,
the AbstractDomains declared in each type and the Abstrac-
tObjects declared in each domain. Each AbstractObject main-
tains bindings, each from a formal to an actual domain, shown
as $(d_{formal} \rightarrow d_{actual})$ to avoid ambiguity.

To avoid extra copying, we directly add AbstractDomains
to the RuntimeGraph. A RuntimeObject knows what
AbstractDomain owns it and maintains a set of AbstractOb-
jects it merges. Given the list of all RuntimeObjects
$(\{o, \ldots\}, d)$ in the RuntimeGraph, the RuntimeObjects that are in a
given AbstractDomain $d_T$ are those that have $d = d_T$.

The analysis obtains the AbstractDomains inside a Runt-
imeObject $\theta$ from each AbstractObject $o_i : T_i$ that $\theta$ merges,
the declared AbstractType $T_i$ of each $o_i$, and each Abstract-
Domain $d_i$ that $T_i$ declares.

The RuntimeGraph keeps track of all RuntimeObjects and 
RuntimeEdges generated to date. The algorithm works by 
applying these rules until it can no longer generate new facts, 
i.e., RuntimeObjects and RuntimeEdges. Some rules remove 
existing facts using a replacement operation. Despite this
non-monotonicity, the algorithm is stable because rule pre- 
conditions prevent regenerating facts that have been replaced. 

For a given input, the rules will always produce the same 
output, regardless of the (potentially non-deterministic) order
in which the re-write rules are applied. A different execution
of the algorithm may produce a graph that is labeled differ-
ently because an object might have multiple types, and one of
those types is picked as the label. Thus, determining that two
outputs of the method are the same is a graph homomorphism
problem.

Objects and Domains. 

The analysis takes as inputs a top-level AbstractDomain
$d_{root}$ and a top-level AbstractObject $o_{root}$. Since all the
AbstractDomains are represented in the RuntimeGraph, the
analysis converts AbstractObjects into RuntimeObjects,
starting with $o_{root}$.

Rule R-Aux-Compat defines type compatibility: the first 
two disjuncts are necessary to handle potential aliasing, the 
third disjunct is a heuristic which we discuss in Section 4.2 
and can be turned off.

Before creating a RuntimeObject for an AbstractObject o
of type t in AbstractDomain d, the analysis checks if d already
has a RuntimeObject $\theta$ of type t' where t and t' are compatible
according to R-Aux-Compat. If not, the analysis creates a new
RuntimeObject (R-New-Object), which we represent as
$\theta = (\{o, \ldots\}, d)$. If a RuntimeObject $\theta$ exists, i.e.,
$\theta = (\{o', \ldots\}, d)$, then the analysis replaces $\theta$ with a new
RuntimeObject that also merges o, i.e., $\theta = (\{o, o', \ldots\}, d)$, using
Rule R-Merge-Objects.

An object about to be created in a domain may have a type
that is compatible with two existing RuntimeObjects that are
not compatible with each other. In this case, the new object
merges (nondeterministically) with one of the existing
objects, and then merges with the other using R-Merge-Ex-
isting. This fixes an unsoundness triggered by multiple inter-
face inheritance in our earlier algorithm [M. Abi-Antoun and
J. Aldrich. Compile-Time Views of Execution Structure
Based on Ownership. In IWACO, 2007.]. For example, con-
sider the following: $\theta_{Tool} = (\{o_{Tool}, o_{VCL}, \ldots\}, d_C)$ and
$\theta_{Cmd} = (\{o_{Cmd}, o_{VCL}, \ldots\}, d_C)$ with $o_{Cmd}$: Command,
$o_{Tool}$: Tool and $o_{VCL}$: VCL. Command <: VCL and Tool <: VCL but neither
Tool <: Command nor Command <: Tool. In JHotDraw,
VCL = ViewChangeListener and $d_C$ = Controller. R-Merge-
Objects replaces $\theta_{Tool}$ and $\theta_{Cmd}$ with
$\theta_{ToolCmd} = (\{o_{Cmd}, o_{Tool}, o_{VCL}, \ldots\}, d_C)$.

Finally, the analysis pulls up each RuntimeObject $\theta$ from
its owning formal domain $d_f$ into a domain $d_a$ that is bound to
$d_f$ (R-Pull-Object), again replacing RuntimeObjects as nec-
essary. FIG. 9 illustrates the rule. In particular, FIG. 9 illus-
trates the pulling operation. Object oPull 912 is pulled from
the formal domain parameter PARAM 914 to the domain
ACTUAL 916. The binding information, shown between
brackets, binds the formal domain parameter PARAM 914 to
the domain ACTUAL 916, and is retrieved from object oPar-
ent 918, one of the ancestors of object oPull 920. The domain
ACTUAL 916 can be the same as the domain PARENT 922,
in which case, object oPull 920 becomes a peer of oParent
918.

Edges. 

Because a RuntimeObject may get replaced, a Runt-
imeEdge is defined as a source path, a source type, a target
path and a target type, rather than source and target Runt-
imeObjects in our earlier pseudocode [M. Abi-Antoun and J.
Aldrich. Compile-Time Views of Execution Structure Based 
on Ownership. In IWACO, 2007.]. A path is a sequence of 
type-qualified domains to traverse to locate an object. 

FIG. 10 shows the DataAccess OOG with more formal
domains, to illustrate transitive pulling. Object 1010 is in the
formal domain parameter ELTS 1014. Object 1012 was
pulled from the formal domain parameter ELTS 1014 into the
PENV formal domain parameter 1016 since ELTS 1014 is
bound to PENV 1016. In turn, Object 1018 was pulled from
the formal domain parameter PENV 1016 to the actual
domain ENV 1020.

In FIG. 10, the path [::lent,UnitTest::DATA] and the type
DataAccess uniquely identify the dataAccess object. For
soundness, when creating the OOG, the analysis adds an edge
from or to each object in the source or target path that is type
compatible with the source or target type (using R-Aux-Com-
pat).

The judgement for creating a RuntimeEdge is of the form
originObject@$[\ldots]_{src} \rightarrow [\ldots]_{dest}$. First, Rule R-New-Edge
creates a new RuntimeEdge by prefixing both the source and
the destination paths with the owning domain. Rules R-Path-
Subst-L and R-Path-Subst-R substitute formals with actuals
in the paths based on the binding information in the origin
RuntimeObject. Finally note that a RuntimeEdge can have
shorter domain paths as a result of the substitution.

Edges and Path-Dependent Domains. 

The ownership domains type system allows path-depen-
dent annotations that are of the form obj1.obj2. ... .DOMAIN,
where obj1, obj2, etc., are chains of final fields or variables,
and DOMAIN is a public domain declared on the type of the
last object in the path. Handling these path-dependent
domains requires extending the formal system to define a
Binding as a mapping from a formal AbstractDomain to a
sequence of AbstractDomains instead of a single AbstractDo-
main, in order to generate longer paths for RuntimeEdges. We
did not include this feature into the core formal system for 
simplicity, but extending the data structures and the rules is 
straightforward. 

Domain Links.

The analysis adds to the runtime graph RuntimeLinks in a
similar manner to RuntimeEdges. Observe, in FIGS. 8a-8c, a
RuntimeLink is a simple kind of RuntimeEdge, one that only
has source and destination domain paths without source and 
destination types. Those are already computed when comput- 
ing RuntimeEdges. Moreover, the underlying ownership 
domains type system guarantees that objects communicate 
only when permitted, so RuntimeEdges cannot violate Runt- 
imeLinks. 

Illustrative Example 

FIGS. 11a-11c show a few representative rewrites that the
reader can follow along with the OOG in FIG. 10, which 
shows formal domains to clarify the binding of formals to 
actuals and the object pulling operation. 

The top-half of the diagram shows selected AbstractGraph 
tuples. The lower-half shows selected RuntimeObjects and 
RuntimeEdges that the rewriting rules create, 3-< denotes the 
next generated fact, and s-t | denotes the fact obtained at the 
fixed point. 

We manually added to FIG. 10 a thick edge labeled X as an
example of an imprecise edge that the rules do not generate.
Intuitively, ELTS is not bound to STATE so no rule should
ever add an edge from v2:ArrayList to num1:Number in
STATE — even though Integer is a subtype of Number. The 
rewriting rules here generate edges that are more precise than 
our earlier algorithm [M. Abi-Antoun and J. Aldrich. Com- 
pile-Time Views of Execution Structure Based on Ownership. 
In IWACO, 2007.]. 

3.5 Soundness 

For the OOG to be most useful, it should be a sound 
approximation of the true runtime object graph for any pro- 
gram run. We formally revisit the earlier definitions in Section 
2.1 and formally prove a key property. Since the OOG is just 
a depth-limited projection that should preserve the soundness 
of the runtime graph by adding summary edges, the true 
runtime object graph (ROG) relates to the RuntimeGraph as 
follows: 

Unique Representatives: 

Each object $\ell$ in the ROG is represented by exactly one
representative in the RuntimeGraph. Similarly, each domain
in the ROG is represented by exactly one domain in the
RuntimeGraph. Furthermore, if object $\ell$ is owned by domain
d in the ROG, then the representative of $\ell$ is owned by the
representative of d in the RuntimeGraph. Similarly, if $\ell$ has a
domain d in the ROG, then the representative for $\ell$ has a
representative domain for d in the RuntimeGraph.

Edge Soundness: 

If there is a field reference from object $\ell_1$ to object $\ell_2$ in the
ROG, then there is a field reference edge between RuntimeObjects
$O_1$ and $O_2$ corresponding to $\ell_1$ and $\ell_2$ in the
RuntimeGraph, and similarly for domain links.
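
The two properties above can be read as a check over a concrete pair of graphs. The following is a minimal sketch, not part of the formal development: the class name EdgeSoundnessCheck, the string encoding of objects and the "from->to" encoding of edges are all assumptions made for illustration. Unique representatives are modeled by a Map, which by construction gives each ROG object at most one representative.

```java
import java.util.*;

// Illustrative only: objects and representatives are plain strings (hypothetical encoding).
class EdgeSoundnessCheck {
    /**
     * rep maps each ROG object to its RuntimeGraph representative.
     * Each ROG edge is a two-element array {source, target};
     * RuntimeGraph edges are encoded as "from->to".
     */
    static boolean isSound(Map<String, String> rep,
                           List<String[]> rogEdges,
                           Set<String> runtimeEdges) {
        for (String[] e : rogEdges) {
            String from = rep.get(e[0]);
            String to = rep.get(e[1]);
            if (from == null || to == null) {
                return false; // an ROG object without a representative
            }
            if (!runtimeEdges.contains(from + "->" + to)) {
                return false; // an ROG edge without a corresponding RuntimeGraph edge
            }
        }
        return true;
    }
}
```

Note that two ROG objects may share one representative (merging), so several ROG edges can be covered by a single RuntimeGraph edge.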

Soundness Proof. 

The proof builds on the formalization of ownership
domains using Featherweight Domain Java (FDJ) [J. Aldrich
and C. Chambers. Ownership Domains: Separating Aliasing
Policy from Mechanism. In ECOOP, 2004.]. FIG. 12 shows a
subset of the store typing rule T-Store. An overbar represents
a sequence. In FDJ, locations represent object identity. A
store $S$ maps locations $\ell$ to their contents: the class of the
object, the actual ownership domain parameters, and the values
stored in its fields. A type in FDJ is a class name and a set
of actual ownership domain parameters. $S[\ell]$ denotes the
store entry for $\ell$. Given an object in the runtime object graph
represented by location $\ell$, $\Sigma[\ell] = C\langle\overline{\ell'.n}\rangle$. Here, each
$\ell'_i.n_i$ refers to a domain named $n_i$ that is part of the runtime
object $\ell'_i$. By Rule Aux-Owner, the first actual domain is the
owner, i.e., $owner(C\langle\overline{\ell'.n}\rangle) = \ell'_1.n_1$ (the formal system treats
the first domain parameter of a class as its owning domain; the
practical system uses a slightly different syntax to emphasize
the semantic difference between the owner domain of an
object and its domain parameters). $S[\ell, i]$ denotes the value in
the $i$-th field of $S[\ell]$. T-Store ensures that the store type $\Sigma$ gives
a type to each location in $S$, one that is consistent with the
classes and actual ownership domain parameters in $S$. $CT$ is
the class table.

Proof of Unique Object Representatives. 

The proof is by induction over the ownership tree. From
FDJ, the ownership relation (where a location $\ell_a$ owns a
location $\ell_1$) has no cycles and is well-founded. The base case
for the induction is trivial. The top-level object in the runtime
object graph has a unique representative in the RuntimeGraph
corresponding to the root RuntimeObject. We strengthen the
inductive hypothesis (i.h.) as follows: Each object in the
runtime object graph of runtime type C is represented by exactly
one RuntimeObject $O$ that merges an AbstractObject o of type C
in the RuntimeGraph.

The proof required the following lemmas, which are
well-formedness rules on the RuntimeGraph.

Lemma: Unique Object per Domain and Type. 

If there exists a RuntimeObject $O = (\{o: T, \ldots\}, d)$ and a
RuntimeObject $O' = (\{o: T', \ldots\}, d)$ with $T' <: T$ or $T <: T'$, then
$O$ is the same as $O'$.

Proof.

Immediate from R-Merge-Objects. Note, this proof does
not reflect the existence of virtual abstract objects. Although
those virtual abstract objects appear in the RuntimeGraph
(and violate the uniqueness invariant), they are omitted from
the OOG.

Lemma: Object Pulling. 

If $\exists$ RuntimeObject$(\{o \ldots\}, d_1)$ and $d_1 \mapsto^{*} d_2$, then $\exists$
RuntimeObject$(\{o \ldots\}, d_2)$.

Proof.

By induction on the length of the binding sequence.
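
As an informal illustration of the Object Pulling lemma, the sketch below propagates objects along domain bindings until a fixed point is reached. It is not the rewriting rules themselves: the class name ObjectPulling, the string encoding of domains and objects, and the two input maps are assumptions made for this example.

```java
import java.util.*;

// Hypothetical, simplified model: domains and abstract objects are plain strings.
class ObjectPulling {
    /**
     * contents: domain -> objects initially declared in it.
     * bindings: formal domain -> domains it is directly bound to.
     * Returns the contents after pulling every object along every binding,
     * repeated until no new (domain, object) fact is added.
     */
    static Map<String, Set<String>> pull(Map<String, Set<String>> contents,
                                         Map<String, Set<String>> bindings) {
        Map<String, Set<String>> result = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : contents.entrySet()) {
            result.put(e.getKey(), new TreeSet<>(e.getValue()));
        }
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Map.Entry<String, Set<String>> b : bindings.entrySet()) {
                Set<String> source =
                    result.getOrDefault(b.getKey(), Collections.<String>emptySet());
                List<String> pulled = new ArrayList<>(source); // snapshot before mutation
                for (String actual : b.getValue()) {
                    Set<String> target =
                        result.computeIfAbsent(actual, k -> new TreeSet<>());
                    if (target.addAll(pulled)) {
                        changed = true;
                    }
                }
            }
        }
        return result;
    }
}
```

With a binding chain of length two, an object declared in the first formal domain ends up in the last domain of the chain, which mirrors the induction on the length of the binding sequence.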

Lemma: Ownership Tree. 

The tree structure must follow the order of object creation.

Proof.

Because the owner of an object must be set at creation time,
and has to be an existing domain on an existing object, the
ownership relation is well-founded and has no cycles. Note,
this assumes that unique is not part of the system. Having
unique could cause the creation of cycles because it leads to
the assignment of ownership after creation.

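Lemma: Binding Chains.

Given a location $\ell$ such that $\Sigma[\ell] = C\langle \ell_0.n_1, \ldots \rangle$, and $\ell_0$
such that $\Sigma[\ell_0] = T_0\langle \ldots \rangle$. If the corresponding object
creation expression is of the form $\texttt{new } C\langle\overline{d}\rangle(\overline{v})$ with $d_1$ a
formal domain, there exists a sequence of $\texttt{new } C_1\langle\overline{d}\rangle(\overline{v}_1) \ldots$
$\texttt{new } C_n\langle\overline{d}\rangle(\overline{v}_n)$, with $CT(C_i) = C_i\langle\overline{\alpha}\rangle$, and a chain of bindings
$C{::}\alpha_1 \mapsto C_1{::}\alpha_1, \ldots, C_n{::}\alpha_i \mapsto T_0{::}n_1$.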

Proof. 

By induction on the evaluation rules (using rules R-New 
and T-New) in FDJ. 

Limitations.

The proof assumes that objects are only created in locally
visible domains or domain parameters: it does not reflect the
existence of the lent, shared and unique domains [J. Aldrich
and C. Chambers. Ownership Domains: Separating Aliasing
Policy from Mechanism. In ECOOP, 2004.]. Indeed, the
OOG may not reflect an object marked unique until it is
assigned to a specific domain. Thus, an inter-procedural flow
analysis is needed to track an object from its creation (at
which point it is unique) until its assignment to a specific
domain. The current tool does not implement this flow
analysis, so a unique object returned from a factory method must be
annotated with the domain in which it should be displayed.
Similarly, the flow analysis can determine what domain a lent
object is really in. Again, objects annotated with lent, except 
for the root object, are currently missing from the OOG. 
Objects that are shared would be trivial to display in the OOG 
but would add many uninteresting edges (the analysis may 
also excessively merge objects in the shared domain), so we 
currently exclude them. 

Assumptions. 

The OOG inherits other properties that are guaranteed by 
the soundness of the underlying ownership type system. For 
example, every object is assigned an owning domain which is 
consistent with all program annotations and does not change 
over time. These invariants are correct up to the following 
assumptions: 

a. All Sources Available: 

The program’s whole source code is available, and the 
program operates by creating some main object and calling a 
method on it (this justifies the focus on a single root object, 
although multiple root objects could in principle be shown). 
The class of that main object is the type of the root of the 
OOG; 

b. No Reflective Code: 

Reflection and dynamic code loading may violate the 
above invariants by introducing unknown objects and edges, 
and possibly violating the guarantees of the underlying own- 
ership system; 

c. Flow Analysis: 

Objects annotated with shared and unique are not currently 
shown in the OOG. Objects that are shared would be trivial to 
add but would add many uninteresting edges to the OOG. 
Objects that are unique would require a flow analysis to be 
handled properly. Usage edges (e.g., method invocations, 
field accesses) could be generated for a system with only 
ownership, but a flow analysis is required for usage edges to 
be sound in the presence of lent objects. 

Despite the assumptions about the whole program source 
being available and restrictions on reflection and dynamic 
loading, our system is still relatively sound in the presence of 
these features. In particular, as long as the reflective opera- 
tions are annotated correctly and consistently with ownership 
information, then any object referred to by some field in the 
source code that is available will show up in the OOG, as 
specified above. 

4 Advanced Features 

We motivate several advanced features of the analysis 
using a real system, JHotDraw, that we revisit in the evalua- 
tion. Section 5.1 highlights how we annotated JHotDraw. 

4.1 Instantiation-Based View 

Generics. 

Merging abstract objects within a domain based on
subtyping (due to possible aliasing) can sometimes lead to
excessive merging. To increase the precision, we use generic types
where applicable to minimize merging. Generic types on
containers prevent merging objects of type Vector<Handle>
with those of type Vector<Figure>, which would be merged if
both used the raw type Vector. For our evaluation, we
refactored all the subject systems in Section 5 to use generic 
types. Refactoring to generics has mature tool support [R. M. 
Fuhrer, F. Tip, A. Kiezun, J. Dolby, and M. Keller. Efficiently 
Refactoring Java Applications to Use Generic Libraries. In 
ECOOP, 2005.] and improves the precision of the extracted 
OOGs. 
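
The effect of generic types on merging can be sketched as follows. This is a simplified illustration rather than the tool's summarization logic: the class GenericMergeKeys and the field names are hypothetical, and objects are grouped by exact declared type text instead of by subtyping.

```java
import java.util.*;

// Hypothetical illustration: group field declarations in one domain by declared type.
class GenericMergeKeys {
    /**
     * Each declaration is {fieldName, declaredType}. With eraseTypeArgs set,
     * type arguments are dropped, which models the use of raw container types.
     */
    static Map<String, List<String>> groupByType(List<String[]> fieldDecls,
                                                 boolean eraseTypeArgs) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (String[] decl : fieldDecls) {
            String type = decl[1];
            if (eraseTypeArgs) {
                int lt = type.indexOf('<');
                if (lt >= 0) {
                    type = type.substring(0, lt); // Vector<Handle> becomes Vector
                }
            }
            groups.computeIfAbsent(type, k -> new ArrayList<>()).add(decl[0]);
        }
        return groups;
    }
}
```

With raw types, a Vector of handles and a Vector of figures fall into one group and would be shown as a single object; with generic types they stay in separate groups.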

Interface Declarations. 

JHotDraw uses inheritance heavily, whereby many types
extend or implement listener interfaces to realize the
Observer design pattern. For instance, instances of Command
and Tool are in the Controller domain, and both interfaces
extend the interface ViewChangeListener. As a result, the
abstract objects for Command and Tool get merged into the same
runtime object. FIG. 14 illustrates the JHotDraw OOG without
using an instantiation-based view: the abstract objects for
Command and Tool were merged into Object 1412, and objects
of type DrawingView and DrawingEditor were merged into
Object 1414. For some applications, the OOG in FIG. 14
will be considered to suffer from too much merging and to be
fairly imprecise.

A key insight, however, is that there are no object creations
of interface types. To regain some component precision (as
defined in Section 2.1), we construct the AbstractGraph
differently to include abstract objects for object creations
instead of field and variable declarations (See FIG. 13, Lines
d,e). Line (e) generates a declaration-based view (DBV) and
Line (f) an instantiation-based view (IBV). This technique is
similar to how Rapid Type Analysis (RTA) determines the
receiver of a method call during the construction of a call
graph [D. F. Bacon and P. F. Sweeney. Fast Static Analysis of
C++ Virtual Function Calls. In OOPSLA, 1996.].

In the example above, the analysis never generates an
abstract object of type ViewChangeListener. Rather, it creates
abstract objects for the instantiated types SelectionTool and
AlignCommand. When constructing the runtime graph, AlignCommand
and SelectionTool are kept distinct since there is no
subtyping relation between them, thus achieving the desired goal
of keeping Command and Tool distinct. In JHotDraw,
SelectionTool <: Tool, ViewChangeListener and
AlignCommand <: Command, ViewChangeListener.
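
The difference between the two views can be sketched with stand-in types. The nested declarations below only mimic the JHotDraw type names mentioned above and are empty placeholders; the class InstantiationBasedView and its methods are assumptions for illustration, and the merge criterion is reduced to plain subtyping between two types in one domain.

```java
import java.util.*;

class InstantiationBasedView {
    // Empty stand-ins for the JHotDraw types named in the text (hypothetical).
    interface ViewChangeListener {}
    interface Command extends ViewChangeListener {}
    interface Tool extends ViewChangeListener {}
    static class AlignCommand implements Command {}
    static class SelectionTool implements Tool {}

    /** Two abstract objects in one domain merge if one type is a subtype of the other. */
    static boolean mergeable(Class<?> t1, Class<?> t2) {
        return t1.isAssignableFrom(t2) || t2.isAssignableFrom(t1);
    }

    /** Counts runtime objects after transitively merging mergeable abstract objects. */
    static int countRuntimeObjects(List<Class<?>> abstractObjects) {
        List<List<Class<?>>> groups = new ArrayList<>();
        for (Class<?> t : abstractObjects) {
            List<Class<?>> merged = new ArrayList<>();
            merged.add(t);
            Iterator<List<Class<?>>> it = groups.iterator();
            while (it.hasNext()) {
                List<Class<?>> group = it.next();
                boolean related = false;
                for (Class<?> u : group) {
                    if (mergeable(t, u)) {
                        related = true;
                    }
                }
                if (related) {       // fold every related group into the new one
                    merged.addAll(group);
                    it.remove();
                }
            }
            groups.add(merged);
        }
        return groups.size();
    }
}
```

Feeding the declared types Command, Tool and ViewChangeListener yields a single runtime object, whereas feeding only the instantiated types AlignCommand and SelectionTool keeps two.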

Virtual Objects. 

The analysis must still handle variable declarations of
interface types. In JHotDraw, CommandMenu declares a
Vector<Command>. This translates into a Command abstract
object inside Vector's ELTS formal domain that stores the
elements. The analysis cannot ignore these abstract objects as
they must get pulled and carry the binding information to
generate the appropriate RuntimeEdges. For instance, ELTS
is transitively bound to Controller; after pulling the Command
abstract object from ELTS to Controller, the analysis
creates a RuntimeEdge from the CommandMenu object
inside View to the Command object inside Controller. The
analysis cannot add a Command abstract object to domain
ELTS either, since this would result in excessive merging.

The analysis handles this case by creating a virtual abstract
object, one that potentially gets pulled just like any other. But
a virtual abstract object does not affect the object
summarization since it is virtually excluded from the list of objects
inside an AbstractDomain, except to prevent the analysis
from re-adding it to that same domain. Finally, when creating
the depth-limited projection of the runtime graph, the analysis
omits the virtual abstract objects after they have served their
purpose.

Special Cases. 

Another corner case arises when the analysis encounters an
object creation expression of the form new Object(). Naively
processing this expression would create an abstract object
that would cause all the objects in that domain to merge into
one runtime object. To avoid this problem, the analysis
synthesizes for that abstract object an implicit anonymous class.



Evaluation of the Instantiation-Based View. 

FIGS. 15a and 15e-15h illustrate a thumbnail of the JHotDraw
OOG based on object instantiations but without any
abstraction. This OOG, even if it groups top-level objects into 
domains, lacks abstraction because it shows objects for 
RedoCommand, NewViewCommand, and OpenCommand, 
as well as objects for ConnectionTool, CreationTool, etc. 
What we really wanted is to merge all Command instances 
together and all Tool instances together, but not merge Tool 
and Command instances together. This OOG also suffers 
from an explosion of imprecise edges: since the analysis 
relies on only the aliasing information provided by the own- 
ership domains type system, it must add RuntimeEdges from 
CommandMenu to RedoCommand, NewViewCommand, 
etc., for soundness. Moreover, a Command wraps another 
Command, so this results in an almost fully connected graph. 

In the following section, we motivate and extricate the type 
abstraction from the core algorithm, but the rewriting rules 
already include it (See Rule R-Aux-Compat). The instantia- 
tion-based view mainly requires constructing the Abstract- 
Graph differently (See FIG. 13) but the transformation from 
the AbstractGraph to the RuntimeGraph stays mostly the 
same. 

4.2 Type Abstraction by Trivial Types 

To improve abstraction and reduce clutter in the OOG, we 
defined the following type abstraction heuristic: merge 
abstract objects whenever they share one or more non-trivial 
least upper bound types. The resulting runtime object has an 
intersection type that includes all the least upper bounds. 

Merging all the abstract objects in a domain into a single
runtime object of type java.lang.Object would result in a
sound but uninteresting OOG! So the type abstraction
heuristic prevents the analysis from merging abstract objects that
only share these trivial types as supertypes. The list of trivial
types is user-configurable and can be empty. Typically, types
such as Object and Serializable are trivial by default. FIG.
15(b) shows the result of using type abstraction on the JHotDraw
OOG based on object instantiations with default trivial
types. Again, this OOG suffers from the same excessive
merging as not using the instantiation-based view.

We can achieve better results for JHotDraw by carefully 
selecting the trivial types. In particular, when we turn on type 
abstraction and manually add ViewChangeListener as a 
trivial type, the analysis merges RedoCommand, NewView- 
Command, OpenCommand, etc., because they have a non- 
trivial least upper bound type Command. Similarly, Connec- 
tionTool, CreationTool, etc., get merged. But ConnectionTool 
and RedoCommand do not get merged because their only 
common supertype is ViewChangeListener. FIG. 15(c) shows 
the result of using type abstraction with the more precise 
trivial types. 

Finally, it is worth noting that since type abstraction leads 
to more merging, it does not compromise soundness (un- 
soundness would mean showing two separate components for 
the same runtime object). This heuristic can also be turned off 
by taking out the last disjunct in Rule R-Aux-Compat (FIGS. 
8a-8c). 
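
The trivial-types heuristic can be sketched as a predicate over two types. The code below is an assumption-laden illustration, not Rule R-Aux-Compat: the class TrivialTypeAbstraction, its method names and the empty stand-in types are hypothetical, and least upper bounds are computed with Java reflection as the minimal elements of the set of common supertypes.

```java
import java.util.*;

class TrivialTypeAbstraction {
    // Empty stand-ins for the JHotDraw types named in the text (hypothetical).
    interface ViewChangeListener {}
    interface Command extends ViewChangeListener {}
    interface Tool extends ViewChangeListener {}
    static class RedoCommand implements Command {}
    static class OpenCommand implements Command {}
    static class ConnectionTool implements Tool {}

    /** All supertypes of t: t itself, its superclasses, all interfaces, and Object. */
    static Set<Class<?>> supertypes(Class<?> t) {
        Set<Class<?>> out = new LinkedHashSet<>();
        Deque<Class<?>> work = new ArrayDeque<>();
        work.push(t);
        while (!work.isEmpty()) {
            Class<?> c = work.pop();
            if (!out.add(c)) {
                continue;
            }
            if (c.getSuperclass() != null) {
                work.push(c.getSuperclass());
            }
            for (Class<?> i : c.getInterfaces()) {
                work.push(i);
            }
        }
        out.add(Object.class); // interfaces report no superclass
        return out;
    }

    /** Common supertypes that have no strict subtype among the common supertypes. */
    static Set<Class<?>> leastUpperBounds(Class<?> a, Class<?> b) {
        Set<Class<?>> common = supertypes(a);
        common.retainAll(supertypes(b));
        Set<Class<?>> lubs = new LinkedHashSet<>();
        for (Class<?> c : common) {
            boolean minimal = true;
            for (Class<?> d : common) {
                if (d != c && c.isAssignableFrom(d)) {
                    minimal = false; // d is a strict subtype of c
                }
            }
            if (minimal) {
                lubs.add(c);
            }
        }
        return lubs;
    }

    /** Merge when at least one least upper bound is not in the trivial list. */
    static boolean shouldMerge(Class<?> a, Class<?> b, Set<Class<?>> trivial) {
        for (Class<?> lub : leastUpperBounds(a, b)) {
            if (!trivial.contains(lub)) {
                return true;
            }
        }
        return false;
    }
}
```

With Object and Serializable as the only trivial types, RedoCommand and ConnectionTool still merge through ViewChangeListener; adding ViewChangeListener to the trivial list separates them while RedoCommand and OpenCommand continue to merge through Command.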

4.3 Type Abstraction by Design Intent Types 

Type abstraction by trivial types abstracts an OOG 
obtained from an instantiation-based view, as discussed 
above. In some cases, a more precise type abstraction method 
is needed. For instance, using the trivial types as discussed 
above, the JHotDraw OOG does not show distinct Drawing 
and Figure objects. Presumably, both Drawing and Figure are 
architecturally relevant. We examined the type hierarchy and
learned that the base class implementing the Drawing
interface, StandardDrawing, extends CompositeFigure. Thus a
Drawing is-a Figure, hence they are merged for soundness,
because they are both in the Model tier (Object 1500 in FIG.
15(c)).

The other insight is that even though an object may have
multiple types, typically some types are more architecturally 
relevant than others. For instance, class StandardDrawing 
extends class CompositeFigure and implements the Drawing 
interface. In turn, class CompositeFigure implements the Fig- 
ure interface. We may want to view a StandardDrawing object 
as a Drawing object not as a Figure object. To achieve this 
precision, we implemented type abstraction by design intent 
types, which works as follows: 

a. Have a user define an ordered list of design intent types;

b. To decide whether to merge two abstract objects o: t and
o': t',

c. Using the list of design intent types,

c.1. Find a design intent type $t_d$ such that $t <: t_d$

c.2. Find a design intent type $t'_d$ such that $t' <: t'_d$

c.3. Merge objects o and o' if $t_d <: t'_d$ or $t'_d <: t_d$

d. If the design intent type list does not include a type for t
or t', then this heuristic does not apply.

Type abstraction by design intent types is an additional
disjunct to Rule R-Aux-Compat, and can also be turned off.
For JHotDraw, the framework package includes classes and 
interfaces that define the drawing framework and does not 
provide any concrete implementation classes. We specified 
the list of design intent types to include all the types in the 
framework package, with a specific ordering (from most to 
least important). For instance, Drawing appears before Figure 
in the design intent types map. 
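
The steps above can be sketched as follows. This is one possible reading, not the tool's implementation: the class DesignIntentTypes, its methods and the empty stand-in types are hypothetical, and the sketch assumes that the first matching entry of the ordered list is the design intent type of an object.

```java
import java.util.*;

class DesignIntentTypes {
    // Empty stand-ins for the JHotDraw types named in the text (hypothetical).
    interface Figure {}
    interface Drawing {}
    static class CompositeFigure implements Figure {}
    static class StandardDrawing extends CompositeFigure implements Drawing {}
    static class TriangleFigure implements Figure {}

    /** First design intent type, in list order, that is a supertype of t; null if none. */
    static Class<?> designIntentType(Class<?> t, List<Class<?>> ordered) {
        for (Class<?> candidate : ordered) {
            if (candidate.isAssignableFrom(t)) {
                return candidate;
            }
        }
        return null;
    }

    /** Merge decision, or null when the heuristic does not apply to one of the types. */
    static Boolean mergeByDesignIntent(Class<?> t1, Class<?> t2, List<Class<?>> ordered) {
        Class<?> d1 = designIntentType(t1, ordered);
        Class<?> d2 = designIntentType(t2, ordered);
        if (d1 == null || d2 == null) {
            return null;
        }
        return d1.isAssignableFrom(d2) || d2.isAssignableFrom(d1);
    }
}
```

With Drawing listed before Figure, a StandardDrawing is viewed as a Drawing even though it is also a Figure, so it is not merged with a TriangleFigure, while CompositeFigure and TriangleFigure still merge as Figures.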

FIG. 15(d) shows the result of using type abstraction with
design intent types on the JHotDraw OOG based on object 
instantiations. There, we can see that objects of type Stan- 
dardDrawing and BouncingDrawing are merged with objects 
of type Drawing into one object (Object 1510). Objects of 
type AbstractFigure, CompositeFigure, and TriangleFigure, 
etc., are merged with objects of type Figure (Object 1520), 
hence keeping objects of type Drawing and Figure distinct in 
the Model domain. 

4.4 Mapping a Display Graph to a C&C Architecture 

A runtime architecture of a system is often represented as 
a standard component-and-connector (C&C) view, repre- 
sented in an Architecture Description Language (ADL). To 
increase the level of abstraction of the extracted runtime 
architecture, a separate analysis converts an OOG into a stan- 
dard C&C architecture. We illustrate the mapping to C&C 
views using the Acme general purpose ADL [Garlan, D., 
Monroe, R., and Wile, D. Acme: Architectural Description of 
Component-Based Systems. In Foundations of Component- 
Based Systems, Leavens, G. T., and Sitaraman, M., Eds., 
Cambridge University Press, 2000.]. This is just one example 
of a standard ADL, and most of the following Acme elements 
are also available in other ADLs: 

a. Component: unit of computation and state; 

b. Port: point of interaction on a Component; 

c. Connector: interaction among components; 

d. System: configuration of Components and Connectors; 

e. Representation: nested sub-architecture;

f. Group: named grouping of elements, such as an archi- 
tectural runtime tier; 

g. Property: a (name, value) pair associated with an ele- 
ment. 

The mapping to a C&C architecture works as follows.

A. The root object maps to a System; 

B. Each object in the OOG maps to a Component; 

C. References between objects create Ports as follows. 

If object A has a field reference of type T to object B:

a. The corresponding Component A has a Port that requires
services and has name B;

b. The Component corresponding to B has a Port that
provides services and has name T;

c. A Connector connects A to B;

d. Architecturally uninteresting self-edges can be
excluded.

D. An ownership domain d in the OOG maps to a Group g.

a. If an object o is in a domain d, the corresponding
Component is in Group g.

E. The OOG hierarchy maps to system decomposition in
Acme.

a. If an OOG object declares domains, the corresponding
Component has a Representation;

b. The corresponding Component has a sub-architecture
inside that Representation.

F. Finally, the analysis populates the details property for an
element with traceability information from the OOG, that
other tools can use to trace from an element in the C&C view
to the code.

FIG. 23 shows the mapping between an OOG and a C&C 
view represented in the Acme Architecture Description Lan- 
guage. 
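
Rules B, C and D of the mapping can be sketched over a flat OOG. The code is illustrative only: the classes OogToCnC, Edge and View are hypothetical containers that hold names as strings, the sketch assumes every edge endpoint appears in the object-to-domain map, and it does not emit Acme syntax or handle Representations.

```java
import java.util.*;

class OogToCnC {
    /** A field reference edge in the OOG: source object, field type, target object. */
    static class Edge {
        final String from, type, to;
        Edge(String from, String type, String to) {
            this.from = from;
            this.type = type;
            this.to = to;
        }
    }

    /** Hypothetical result container; a real tool would build ADL elements instead. */
    static class View {
        final Map<String, Set<String>> requiredPorts = new TreeMap<>(); // component -> port names
        final Map<String, Set<String>> providedPorts = new TreeMap<>(); // component -> port names
        final Map<String, Set<String>> groups = new TreeMap<>();        // group -> components
        final List<String> connectors = new ArrayList<>();
    }

    static View map(Map<String, String> objectToDomain, List<Edge> edges) {
        View v = new View();
        for (Map.Entry<String, String> e : objectToDomain.entrySet()) {
            v.requiredPorts.put(e.getKey(), new TreeSet<>());          // rule B: one Component per object
            v.providedPorts.put(e.getKey(), new TreeSet<>());
            v.groups.computeIfAbsent(e.getValue(), k -> new TreeSet<>())
                    .add(e.getKey());                                    // rule D: domain -> Group
        }
        for (Edge e : edges) {
            if (e.from.equals(e.to)) {
                continue;                                                // rule C.d: skip self-edges
            }
            v.requiredPorts.get(e.from).add(e.to);                      // rule C.a: required Port named B
            v.providedPorts.get(e.to).add(e.type);                      // rule C.b: provided Port named T
            v.connectors.add(e.from + " -> " + e.to);                   // rule C.c: Connector from A to B
        }
        return v;
    }
}
```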

This section presented one set of possible rules to map an
OOG to a C&C view that an automated analysis can apply.
There are other modeling choices, including giving a
developer greater control of the mapping. For instance, a developer
may merge two components in the OOG into one component
in the C&C view. It is also possible to map an entire domain
to a Component. It is also possible to have the user map the
implementation types to architectural types, to generate an
as-built view with richer types, styles and properties [M.
Abi-Antoun, J. Aldrich, N. Nahas, B. Schmerl, and D. Garlan.
Differencing and Merging of Architectural Views.
Automated Softw. Eng., 15(8), 2008.].

In follow-up work, we check and measure the structural
conformance of the extracted C&C view with an as-designed
architecture [M. Abi-Antoun and J. Aldrich. Checking and
Measuring the Architectural Structural Conformance of
Object-Oriented Systems. CMU-ISRI-07-119, 2007.]. The
present invention focuses on architectural extraction,
although the present invention is not so limited.

4.5 Analyzing a Computer Program Module

The present invention allows analyzing a computer pro- 
gram module, understood to mean part or all of a computer 
program. 

Stop-Analysis Configuration Files. 

An ownership type system often makes a closed world
assumption. This requires analyzing all the classes that are
transitively referenced (through constructor calls, field
references, etc.), including classes in external library code. It is
often desirable to analyze only a computer program module. To
this effect, we define a stop-analysis configuration file to
allow the user to specify the list of classes, packages or
compilation units that must not be analyzed. The user can
exclude zero or more entire packages, zero or more types or
zero or more compilation units, using regular expressions, as
follows:

ExcludePackage:nameregex // Exclude named package
ExcludeType:nameregex // Exclude type
ExcludeCompUnit:Test*.java // Exclude compilation units
for unit tests
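
A reader for such a file might look like the sketch below. The text does not specify the exact file grammar or the pattern dialect, so everything here is an assumption: the class StopAnalysisConfig is hypothetical, each line is taken to have the form "Directive:pattern" with an optional trailing // comment, and patterns are interpreted as java.util.regex expressions that must match the whole name (so a pattern containing // is not supported).

```java
import java.util.*;
import java.util.regex.Pattern;

// Hypothetical reader for a stop-analysis configuration file.
class StopAnalysisConfig {
    private final Map<String, List<Pattern>> rules = new HashMap<>();

    /** Parses lines of the assumed form "Directive:regex // optional comment". */
    static StopAnalysisConfig parse(List<String> lines) {
        StopAnalysisConfig cfg = new StopAnalysisConfig();
        for (String raw : lines) {
            int comment = raw.indexOf("//");
            String line = (comment >= 0 ? raw.substring(0, comment) : raw).trim();
            int colon = line.indexOf(':');
            if (line.isEmpty() || colon < 0) {
                continue; // skip blank or malformed lines
            }
            String directive = line.substring(0, colon).trim();
            Pattern pattern = Pattern.compile(line.substring(colon + 1).trim());
            cfg.rules.computeIfAbsent(directive, k -> new ArrayList<>()).add(pattern);
        }
        return cfg;
    }

    /** True if any pattern registered under the directive matches the whole name. */
    boolean isExcluded(String directive, String name) {
        List<Pattern> patterns =
            rules.getOrDefault(directive, Collections.<Pattern>emptyList());
        for (Pattern p : patterns) {
            if (p.matcher(name).matches()) {
                return true;
            }
        }
        return false;
    }
}
```

Under this regex interpretation, a unit-test exclusion would be written as Test.*\.java; a glob-style pattern such as Test*.java would need to be translated first.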

This technique is also supported by other tools such as
Womble [D. Jackson and A. Waingold. Lightweight
Extraction of Object Models from Bytecode. TSE, 27(2), 2001.].

Annotations.

The annotations can be stored in the computer program
files using language support for annotations or as comments.
They can also be stored in external files. The present
embodiment provides tool support for associating ownership domain
annotations with any Java source file or Java bytecode file
using an external file.

Typically, we add the ownership information directly as
annotations to the files that are part of the computer program
modules that are being analyzed. And we store in external
files any annotations for the computer program modules that
are not analyzed.

Library Code. 

There are two approaches to support adding annotations to 
the standard Java libraries and other third-party libraries. The 
first involves annotating the sources of the library (when 
available) and pointing the tool to the annotated sources. The 
second involves placing the annotations in external files. The 
tool can support both methods. The latter approach is often 
preferable since it does not require changing the library or 
third-party code — which may not be available and when it is, 
tends to evolve separately. Other annotation based systems 
also support storing the annotations in external files [M. D. 
Ernst and D. Coward. JSR 308: Annotations on Java types. 
http://pag.csail.mit.edu/jsr308/, 2006.]. 

Virtual Fields. 

When parts of a computer program are excluded from the 
analysis, the annotation must soundly summarize the parts 
that are not being analyzed, in order to preserve the soundness 
of the extracted architecture. 

If the user of the tool knows about fields in missing code or
external library code, and would like to take them into
consideration, he can use virtual fields
[McCamant, S., and Ernst, M. D. Early Identification of
Incompatibilities in Multi-component Upgrades. In ECOOP, 
2004.] or ghost variables [Flanagan, C., Leino, K. M., Lillib- 
ridge, M., Nelson, G., Saxe, J. B., and Stata, R. Extended 
Static Checking for Java. In PLDI, 2002.]. A virtual field 
holds information that is closely related to the meaning of an 
object, but need not be kept directly in the object in a particu- 
lar implementation [ME04]. These annotations do not affect 
the execution of the system at runtime but are treated as an 
object’s actual fields by the analysis. 

5 Evaluation 

We evaluated the quality of the extracted OOGs on several 
extended examples of medium-sized representative programs 
to answer the following research questions: 

Can an OOG have a meaningful level of abstraction (or 
does it suffer from too much or too little merging)? 

Based on an OOG, can a developer learn what annotations
she can adjust to get a desired architectural view? 

Can an automatically extracted OOG be related to a runt- 
ime architecture manually drawn by a developer? 

Methodology. 

The tool support for architectural extraction consists of two
Eclipse plugins. JavaDomains is a typechecker to validate
ownership domain annotations that a developer inserts as Java
1.5 annotations [M. Abi-Antoun and J. Aldrich. Ownership
Domains in the Real World. In IWACO, 2007.]. OOG Wizard
is a plugin to extract an OOG. It allows the developer to select
the projection depth, elide substructure on selected objects
and set the trivial types. FIGS. 16a-16e are snapshots of the
Ownership Object Graph Tool as of this writing. The left pane
shows the ownership tree and the right pane shows the
depth-limited projection. The tool tracks the abstract objects that are
merged into a runtime object. Since abstract objects are
associated with elements of the program's Abstract Syntax Tree
(AST), this allows tracing from an element in the OOG to the
corresponding variable declarations in the code. The slider
control sets the projection depth. The tool allows the user to
set the type abstraction property using trivial types or design
intent types. The tool displays a static type hierarchy of the
types of the abstract objects that are merged into a runtime
object to assist the user with selecting the trivial types and the
design intent types. Finally, the tool can generate a
component-and-connector runtime architecture. The OOG is laid
out automatically using GraphViz [E. R. Gansner and S. C.
North. An Open Graph Visualization System and its
Applications to Software Engineering. Softw. Practice & Exp.,
30(11), 2000.].

The study's subject (one of us, hereafter "we") developed
the OOG Wizard but none of the subject systems. He mostly
learned their architectural structure from iteratively
annotating the code, examining the extracted OOGs and relating the
OOGs to diagrams of the code architecture drawn by others.
For one system (JHotDraw), he had access to a tutorial by the
original designers, but for a slightly older version than the one
he annotated. The tutorial discusses the design patterns that
JHotDraw implements but does not describe the system's
runtime architecture. We previously studied another subject
system (HillClimber) by re-engineering it to ArchJava. The
re-engineering case study also produced a version that cleaned
up the original code, for instance by making most class fields
private. For this case study, we started from the refactored
Java version and added ownership domain annotations to it.

We discuss in detail the annotation process of the subject
systems elsewhere. In addition, we discuss the differences
between adding ownership annotations to the plain Java
program, compared to the re-engineered ArchJava program, for
the HillClimber subject system [M. Abi-Antoun and J.
Aldrich. Ownership Domains in the Real World. In IWACO,
2007.].

5.1 JHotDraw 

JHotDraw is rich with design patterns, uses composition
and inheritance and has evolved through several versions.
Version 5.3 has 200 classes and 15,000 lines of Java.

Design documentation for JHotDraw is available, e.g., [E.
Gamma. Advanced Design with Patterns and Java (Tutorial).
In JAOO, 1998. JHotDraw version 5.1.], [D. Riehle.
Framework Design: a Role Modeling Approach. PhD thesis, 2000.],
[W. Kaiser. Become a Programming Picasso with JHotDraw.
JavaWorld, 2001.]. FIG. 17 is a manually generated JHotDraw
class diagram showing how we annotated instances of the
selected types (Source: [D. Riehle. Framework Design: a
Role Modeling Approach. PhD thesis, 2000.]). The class
diagram in FIG. 17 shows some of the core types. An often cited
article [W. Kaiser. Become a Programming Picasso with
JHotDraw. JavaWorld, 2001.] discusses how JHotDraw
follows the Model-View-Controller design pattern (the package
structure does not reveal that fact since all the types in FIG. 17
are in one framework package).

Annotations. 

We defined the following three top-level domains and orga- 
nized instances of the core types as follows: 

Model: has instances of Drawing, Figure, Handle, etc. A
Drawing is composed of Figures. A Figure has Handles for
user interactions;

View: DrawingEditor, DrawingView, etc., instances;

Controller: has instances of Tool, Command and
Undoable. A DrawingView uses a Tool to manipulate a Drawing. A
Command represents an action to be executed (the
Command pattern without undo).

Evaluation.

In Section 4, we showed some earlier OOGs we extracted
for JHotDraw. The OOG in FIG. 14 suffered from too much
merging: it merged DrawingView and DrawingEditor and
made it hard to recognize many of the key abstractions from
the class diagram in FIG. 17. Moreover, the OOG in FIG. 15
was unreadable and required type abstraction.

Type Abstraction by Trivial Types. 

The user selects the trivial types as follows. First, he
graphically selects an object which appears to merge too
many objects. The OOG Wizard then displays an inheritance
hierarchy of the types of the abstract objects that are merged
into the selected object. The general principle is that the user
must select a type that would cut the path from an interesting
leaf type in the inheritance hierarchy up to an uninteresting
common ancestor.

For JHotDraw, we started with the list of default trivial
types, which includes several interfaces from the Java
Standard Library such as Serializable, Cloneable, etc. Many of
these types are marker interfaces that do not declare any
methods. Others are constant interfaces that only define
constants, such as SwingConstants (inheriting from a constant
interface is a bad coding practice, the Constant Interface
antipattern [J. Bloch. Effective Java. Addison-Wesley, 2001,
Item #17], and Java 1.5 supports static imports for this
purpose). JHotDraw had its own list of interfaces that many
classes implement such as Storable and Animatable. We also
included several listener interfaces, such as
ViewChangeListener (as discussed in Section 4.1). Because of JHotDraw's
complex inheritance hierarchy, the list of trivial types needed
fine-tuning to achieve the desired level of abstraction, more
so than the other subject systems. We did not use type
abstraction on the last subject system (Section 5.3).

Riehle previously studied JHotDraw and produced the
code architecture in FIG. 17. Riehle posited that the original
JHotDraw designers used the following techniques to present
the JHotDraw design in their tutorials: a) merge interface and
abstract implementation class (although important for code
reuse, such a code factoring is often unimportant from a
design standpoint); and b) subsume a set of similar classes
under a smaller set of representative classes (showing many
similar subclasses that vary only in minor aspects often leads
to needless clutter) [D. Riehle. Framework Design: a Role
Modeling Approach. PhD thesis, 2000, pp. 139-140].

The OOG type abstraction seems to produce similar
results to the above heuristics. For instance, all runtime
Handle objects referenced in the program by the Handle
interface, its abstract implementation class AbstractHandle,
or any of its concrete subclasses ElbowHandle, NullHandle,
etc., appear as one Handle component in the Model tier. Type
abstraction can sometimes lead to a precision loss: not all
Handle classes have a field reference to a Locator as FIG. 17
indicates. Only NullHandle and its subclasses do. But since
they were all merged into Handle, the OOG shows an edge
from Handle to Locator in FIG. 18b. FIGS. 18a-18d illustrate
the top-level OOG for the JHotDraw subject system. There
are three top-level domains: Model, View and Controller.
Object Figure merges objects of type Figure, TriangleFigure,
etc. Because a Drawing is implemented as a Figure, object
Figure also merges objects of type Drawing, StandardDrawing,
etc. The DrawingView interface extends the
DrawingChangeListener interface. Hence the edge from object
fListeners inside object Figure to the DrawingView object.
Inside object Figure, object fFigures contains the composite
Figure objects. Object Handle merges objects of type
NullHandle, GroupHandle, etc. Point objects are immutable
and passed linearly, hence they do not appear in the OOG.

We were slightly surprised when we inadvertently added
interface Handle as a trivial type. This resulted in an OOG 
with one object for NullHandle (which directly implements 
Handle) and another object for all instances of the concrete 
subclasses that implement Handle by extending 
AbstractHandle. While this result seemed counter-intuitive, 
that OOG was sound: there is no runtime object that can have 
both types NullHandle and AbstractHandle, so one runtime 
object does not appear as two in the OOG. 
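
The effect of marking Handle as trivial can be illustrated with a minimal Java sketch. It assumes a simplified merging rule (two objects in the same domain may merge when they share at least one supertype that is not in the trivial set), which only approximates the analysis described here. The class TrivialTypesSketch, the method mergeable, and the nested handle types (including OtherHandle) are hypothetical stand-ins, not JHotDraw or OOG Wizard code.

```java
import java.io.Serializable;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

class TrivialTypesSketch {
    // Hypothetical stand-ins for the handle hierarchy discussed above.
    interface Handle extends Serializable {}
    static abstract class AbstractHandle implements Handle {}
    static class NullHandle implements Handle {}            // implements Handle directly
    static class ElbowHandle extends AbstractHandle {}
    static class OtherHandle extends AbstractHandle {}      // invented second subclass

    /** Collects a type together with all of its superclasses and superinterfaces. */
    static Set<Class<?>> supertypes(Class<?> start) {
        Set<Class<?>> seen = new LinkedHashSet<>();
        Deque<Class<?>> work = new ArrayDeque<>();
        work.push(start);
        while (!work.isEmpty()) {
            Class<?> t = work.pop();
            if (!seen.add(t)) continue;                      // already visited
            if (t.getSuperclass() != null) work.push(t.getSuperclass());
            for (Class<?> i : t.getInterfaces()) work.push(i);
        }
        return seen;
    }

    /** Assumed rule: mergeable if some common supertype is not trivial. */
    static boolean mergeable(Class<?> a, Class<?> b, Set<Class<?>> trivial) {
        Set<Class<?>> common = supertypes(a);
        common.retainAll(supertypes(b));
        common.removeAll(trivial);
        return !common.isEmpty();
    }

    public static void main(String[] args) {
        Set<Class<?>> trivial = new HashSet<>();
        trivial.add(Object.class);
        trivial.add(Serializable.class);
        // Handle is a shared non-trivial supertype, so the two handles merge.
        System.out.println(mergeable(NullHandle.class, ElbowHandle.class, trivial));
        trivial.add(Handle.class);
        // With Handle trivial, NullHandle no longer merges with AbstractHandle subclasses.
        System.out.println(mergeable(NullHandle.class, ElbowHandle.class, trivial));
        // Subclasses of AbstractHandle still merge with each other.
        System.out.println(mergeable(ElbowHandle.class, OtherHandle.class, trivial));
    }
}
```

Under this rule, no class in the sketch is a subtype of both NullHandle and AbstractHandle, which mirrors the soundness argument above: splitting the Handle component in two cannot make one runtime object appear twice.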

Potential Design Flaw. 

We were surprised that the OOG did not show distinct 
Drawing and Figure objects, presumably core types in the 
class diagram in FIG. 17. 

We used the tool to determine that one object in the Model
domain merged both Drawing and Figure. We examined the
type hierarchy and learned that the base class implementing
the Drawing interface, StandardDrawing, extends
CompositeFigure. Thus a Drawing is-a Figure. We researched this
finding and found a brief mention in the Version 5.1 Release
Notes. Still, in the framework package, interface Drawing
does not extend Figure! We then checked the JHotDraw
tutorial. Indeed, the JHotDraw designers explicitly asked to “not
commit to the CompositeFigure implementation since some
applications need a more complicated representation” [E.
Gamma. Advanced Design with Patterns and Java (Tutorial).
In JAOO, 1998. JHotDraw version 5.1., Slide #16].

OOG = Architecture?

The OOG in FIGS. 18a-18d seems to have the right level of
abstraction since we recognize in it most of the core types
from FIG. 17.

A key issue in architectural extraction is distinguishing 
between architecturally relevant and non-architecturally rel- 
evant objects. The OOG provides architectural abstraction by 
folding lower-level objects into higher-level architectural 
components. As a result, the OOG does not show non-archi- 
tecturally relevant objects in the top-level domains. Collaps- 
ing many nodes into one is a classic approach to shrink a 
graph. However, the OOG statically collapses nodes based on 
the actual execution and ownership structure, and not accord- 
ing to where objects were declared in the program, or accord- 
ing to some naming convention. 

There are two ways to control the level of detail. One is to
control the OOG projection depth, which affects the depth of
object substructure uniformly for all objects starting from the
root of the ownership tree, using the slider control in the OOG
Wizard in FIG. 16c. The other is to collapse individual objects:
because one object’s substructure may be more interesting
than that of some other object, the OOG Wizard tool allows
the user to collapse the internals of a selected object; in that
case, the tool appends the (+) symbol to that object’s label.
In FIGS. 18a-18d, we manually elided
the substructure of all the objects in the top-level domains
except for Drawing because we wanted to highlight the
Composite pattern. Inside Drawing, the owned domain shows
several objects. We recognize a Vector<Figure>, fFigures,
that maintains the list of sub-figures, and a summary edge from
fFigures to figure:Figure in Model.

5.2 HillClimber 

By many accounts, JHotDraw is the brainchild of
object-oriented analysis and design (OOAD) experts. The second
subject system, HillClimber, is a 15,000 line Java application
that was developed by undergraduates. HillClimber is also
interesting because it uses a framework and its architectural
structure had degraded over the years [M. Abi-Antoun and J.
Aldrich. Ownership Domains in the Real World. In IWACO,
2007.]. Our goal was to evaluate the OOG of a program that
was not well-designed by OOAD experts. In HillClimber, the
application window uses a canvas to display nodes and edges
of a graph to show the output of a computational engine.

Annotation. 

The ownership annotations organized objects into a data
domain to store the graph, a ui domain to hold user interface
objects, and a logic domain to hold the engine, search objects,
and associated objects. While adding annotations to
HillClimber, we refactored the code to reduce coupling between
ui and data objects [M. Abi-Antoun and J. Aldrich.
Ownership Domains in the Real World. In IWACO, 2007.].

Evaluation. 

FIG. 19 illustrates the top-level OOG for the HillClimber 
subject system. There are three top-level domains, dataTier, 
userTier and logicTier. The objects in the top-level domains 
are collapsed, except for object search inside the logicTier 
domain. Object search has a HEURISTICS public domain
with two array objects inside it. Object heuristics, which is a
peer of object search inside logicTier, accesses those array
objects directly. Such an edge 1912 would not be possible in
a strict owner-as-dominator type system. The tool shows the
abstract objects merged into a runtime object (FIG. 16b). We 
used that information to learn what abstract objects in the 
program required different annotations. We refined those 
annotations using the following two strategies. 

Strategy #1: Use Encapsulation. 

We reduced the clutter in the dataTier by pushing more 
objects into private domains of other objects. For instance, we 
placed heap:HillHeap inside a private domain of graph:HillGraph.
We also pushed several Vectors into private domains
and ensured that the other references to them were unique 
(they were actually passed linearly between objects). In a few 
cases, we changed the code to prevent representation expo- 
sure by returning a copy of an internal list instead of an alias. 
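
The last change, returning a copy instead of an alias, can be sketched as follows. This is an illustrative example rather than the actual HillClimber code: the class HillGraphSketch and its methods are hypothetical, and it uses a List where the original code used Vectors.

```java
import java.util.ArrayList;
import java.util.List;

class HillGraphSketch {
    // Conceptually in a private domain: no outside object should alias this list.
    private final List<String> nodes = new ArrayList<>();

    void addNode(String name) {
        nodes.add(name);
    }

    /** Exposes the representation: the caller receives an alias of the internal list. */
    List<String> nodesAliased() {
        return nodes;
    }

    /** Returns a fresh copy, so the internal list stays encapsulated. */
    List<String> nodesCopied() {
        return new ArrayList<>(nodes);
    }

    int size() {
        return nodes.size();
    }
}
```

A caller that mutates the list returned by nodesCopied() leaves the graph unchanged, whereas mutating the list returned by nodesAliased() changes the graph's internal state, which is the kind of alias that a private domain annotation is meant to rule out.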

Strategy #2: Use Logical Containment. 

We defined public domains to reduce the number of top- 
level objects. A public domain groups related objects, pushes 
the inner objects it contains down the ownership tree and 
removes them from the top-level domains, while keeping 
those inner objects accessible to objects that can access the 
outer object. Object search has a HEURISTICS public 
domain with two array objects inside it; its peer object heu- 
ristics inside logicTier accesses those array objects directly. 
Note that such an edge would not be possible in an
owner-as-dominator type system [D. Clarke, J. Potter, and J. Noble.
Ownership Types for Flexible Alias Protection. In OOPSLA,
1998.].
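
The HEURISTICS example can be sketched in plain Java. The annotation types Domains and InDomain below are hypothetical stand-ins declared inside the sketch itself; they are not the annotation language used by the tool, and the field names are invented. The sketch only illustrates the idea that objects placed in a public domain of search remain reachable by its peer heuristics, while objects in a private domain should not be.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Arrays;

class PublicDomainSketch {
    /** Hypothetical annotation: declares the private and public domains of a class. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Domains {
        String[] privateDomains() default {};
        String[] publicDomains() default {};
    }

    /** Hypothetical annotation: names the domain holding the object a field refers to. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface InDomain {
        String value();
    }

    @Domains(privateDomains = {"owned"}, publicDomains = {"HEURISTICS"})
    static class Search {
        @InDomain("HEURISTICS") final int[] weights = new int[4];
        @InDomain("HEURISTICS") final int[] scores = new int[4];
        @InDomain("owned") final StringBuilder log = new StringBuilder();
    }

    /** A peer of Search in the same top-level domain. */
    static class Heuristics {
        // Refers to an object inside the public domain of its peer.
        @InDomain("search.HEURISTICS") final int[] weights;

        Heuristics(Search search) {
            this.weights = search.weights;
        }
    }

    /** True if the named domain is declared public on the given class. */
    static boolean isPublicDomain(Class<?> c, String domain) {
        Domains d = c.getAnnotation(Domains.class);
        return d != null && Arrays.asList(d.publicDomains()).contains(domain);
    }
}
```

The annotations here carry information only; nothing in them runs when Search or Heuristics executes. In this sketch the distinction between public and private domains is recorded but not enforced, since enforcement is the job of the ownership type checker.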

5.3 Aphyds 

Aphyds is an 8,000 line circuit layout application that 
Aldrich et al. studied previously [J. Aldrich, C. Chambers,
and D. Notkin. ArchJava: Connecting Software Architecture
to Implementation. In ICSE, 2002.]. Aphyds follows the
Document-View style where the views are user interface 
objects, and the model consists of a circuit and computational 
objects to partition and route the circuit. 

Annotations — Round 1. 

We initially organized the Aphyds objects into two top- 
level domains: 

UI: contains a CircuitViewer object and several subsidiary
user interface objects;

Model: holds a Circuit object and a set of computational
objects that act on it, such as Floorplanner and Partitioner.

Evaluation — Round 1. 

These annotations produced an OOG with too many com- 
ponents in the top-level domains (FIG. 20(a)). 

FIG. 20a shows the top-level OOG for the Aphyds subject 
system. There are two top-level domains, UI and Model. This 
OOG is not very comparable to the diagram drawn by the 
original Aphyds developer (FIG. 1). Note, in particular, that
objects Circuit 2000, Net 2010, Terminal 2020 and Node 
2030 are all at the same level. 

Annotations — Round 2. 

We examined the OOG and determined which runtime
objects needed to be pushed down the hierarchy. Using the
tool, we learned what abstract objects declared in the program
required different annotations. For instance, we needed to
push Net and Node objects underneath Circuit. Using FIG. 1
as a guide, we created public domains as follows:

A. CircuitViewer.DISPLAY: a public domain on the
CircuitViewer to hold a Displayer object that all the other UI objects
had references to;

B. Circuit.DATABASE: a public domain on the Circuit
object, that includes the objects associated with the Circuit,
such as Net, Node, and Terminal;

C. Partitioner.DATABASE: a public domain on Partitioner to
hold PartitionTranscript and PTnode objects;

D. Floorplanner.DATABASE: a public domain on
Floorplanner for the floorplanning objects, such as SlicingTree;

E. GlobalRouter.DATABASE: a public domain on
GlobalRouter to hold NetGlobalRouting objects.

Just as with HillClimber, we also reduced the clutter by
pushing objects such as Vector<Floorplan> into private
domains or by passing them linearly between objects.
Evaluation — Round 2. 

The Aphyds OOG with the revised annotations is in FIG.
20(b). FIG. 20b illustrates the top-level OOG for the Aphyds
subject system. There are two top-level domains, UI and
Model. Many objects that were in the Model domain were
moved into public domains of other objects in the Model
domain, such as Channel, GlobalRouter, Partitioner and
Circuit. Those public domains are elided except for the Circuit
object. Inside object Circuit 2000, public domain DATABASE
has Node 2030, Net 2010 and Terminal 2020 objects
inside it. The owned domain inside Circuit stores Hashtable
objects. We also collapsed most substructures except for the
Circuit object. Indeed, this OOG is very comparable to the
developer diagram of the Aphyds runtime architecture (FIG.
1). For instance, viewer, circuit and fp in the OOG map to
circuitViewer, Circuit and FloorPlanner in FIG. 1. Objects
Node, Net in the DATABASE public domain map to node and
net inside circuit’s sub-architecture in FIG. 1.

As is to be expected from a manually generated diagram,
FIG. 1 omitted several edges between UI and Model objects
as well as edges between objects in the Model tier. The OOG
has an additional object in UI, PartDialog, which connects to
the Partitioner. Upon a closer examination of the OOG, we
noticed a reference from placer in Model to PlacerDialog in
UI. This was a potential red flag since Aphyds is a
multi-threaded application: a worker thread executing long running
operations cannot carelessly call back into the user interface
thread. The traceability information in the OOG helped us
relate this callback to a field of type PlaceRouteDialog
declared in class Placer. We did verify, however, that the code
correctly handled the callback.

Discussion. 

When Aldrich et al. re-engineered Aphyds to specify its
architecture in ArchJava, they used component classes to
create the hierarchy [J. Aldrich, C. Chambers, and D. Notkin.
ArchJava: Connecting Software Architecture to
Implementation. In ICSE, 2002.]. Here, we used public domains to create
logical containment. The information we gleaned from the
OOG is consistent with what Aldrich et al. found [J. Aldrich,
C. Chambers, and D. Notkin. ArchJava: Connecting Software
Architecture to Implementation. In ICSE, 2002.], but did not
require re-engineering the application. There was a mismatch
between the edges that the developer diagram intended to
show and the ones that the OOG currently shows (field
references). We plan to add control and data flow edges to the
OOG.

6 Discussion 

Field Studies. 

We have conducted additional evaluation in the form of a 
field study where we used the present invention to analyze a 
30 KLOC module of a commercial proprietary computer 
software program totaling 250 KLOC. 

As expected, the Java program was currently under active 
development, and already used Java 1.5 generics, so we did
not have to refactor the code. Using the present invention, we 
were able to extract meaningful runtime architectures of the 
computer program module in under 40 hours. Most of the 
time was spent adding annotations to the computer program 
module. 

Performance. 

The following table measures the execution time of the 
static analysis on several subject systems. The OOG time
includes parsing the program’s abstract syntax tree to retrieve
the annotations, building the abstract graph, converting it into a
runtime graph, and then into a display graph. Note that the
IBV time is lower because when using the instantiation-based 
view, there are fewer abstract objects that the analysis must 
manipulate. 

Table 1: OOG is the extraction time measured on an Intel
Pentium 4 (3 GHz) with 2 GB of memory. WARN is the number of
remaining annotation warnings. IBV indicates if the
instantiation-based view was used. TABS indicates if type
abstraction was used.


System        LOC      OOG     TABS   IBV   WARN
JHotDraw      15,000   2'18"   No     No    60
JHotDraw      15,000   0'16"   Yes    Yes   60
HillClimber   15,000   0'26"   No     No    42
HillClimber   15,000   0'09"   Yes    Yes   42
Aphyds        8,000    0'24"   No     No    72


Annotation Overhead. 

We currently add the ownership domains annotations 
manually so we used medium-sized programs for the evalu- 
ation. The annotation cost could be reduced by ownership 
inference [J. Aldrich, V. Kostadinov, and C. Chambers. Alias 
Annotations for Program Understanding. In OOPSLA, 
2002.], [K.-K. Ma and J. S. Foster. Inferring Aliasing and 
Encapsulation Properties for Java. In OOPSLA, 2007.] and 
amortized over other uses of ownership, e.g., for program 
verification [Y. Lu, J. Potter, and J. Xue. Validity Invariants 
and Effects. In ECOOP, 2007.]. On the other hand, the
annotations allow the developers to obtain meaningful
architectural abstractions, rather than automatically inferred ones that
are likely to be poor matches to the desired software
architecture.

For the proposed approach to be adoptable, better owner- 
ship inference must be developed. Existing ownership infer- 
ence tools adopt restrictive notions of ownership, do not map 
their results to a type system, do not infer ownership param- 
eters [K.-K. Ma and J. S. Foster. Inferring Aliasing and 
Encapsulation Properties for Java. In OOPSLA, 2007.] or 
infer imprecise ones [J. Aldrich, V. Kostadinov, and C.
Chambers. Alias Annotations for Program Understanding. In
OOPSLA, 2002.].

Annotations vs. Language Extensions.

Adding ownership annotations to an existing program is
easier than re-engineering it to ArchJava. Both authors
previously re-engineered existing systems to ArchJava. The
restrictions that ArchJava imposes on existing programs are
fairly onerous. For instance, ArchJava prohibits returning
references to instances of component classes when most
object-oriented code returns object references liberally. In
addition, ArchJava does not allow a component class to have
public fields. In contrast, it is possible to annotate such fields
with public domains (private domains cannot be used, of
course).

For Aphyds, Aldrich et al. converted dynamic connections
into static ones and inadvertently injected several defects to
produce a system with over 20 components and 80 “ports” in
the code [J. Aldrich, C. Chambers, and D. Notkin. ArchJava:
Connecting Software Architecture to Implementation. In
ICSE, 2002.]. Aldrich et al. previously added ownership
annotations to a subset of the Aphyds subject system (around
3,500 lines) in “three hours and 40 minutes — less than a 
quarter of the time that it took the same programmer to 
express the control-flow architecture of the same part of
Aphyds” in ArchJava [J. Aldrich, C. Chambers, and D. Notkin.
ArchJava: Connecting Software Architecture to
Implementation. In ICSE, 2002.]. Similarly, when we re-engineered
HillClimber, we initially introduced runtime exceptions due to
disconnected ports and changed the application’s
initialization order [M. Abi-Antoun, J. Aldrich, and W. Coelho. A Case
Study in Re-engineering to Enforce Architectural Control
Flow and Data Sharing. JSS, 80(2), 2007.].

What are “Good” Annotations? 

Just as there are multiple architectural views of a system,
there is no single right way to annotate a program. Good
annotations minimize the number of top-level components in
the OOG by pushing more objects underneath other objects.
The best annotations produce an OOG that is comparable to
what an architect might draw for the as-designed architecture,
as long as the program supports them!

The OOG can guide a developer to refine the annotations. 
For JHotDraw, we initially placed Handle instances in the 
Controller domain and later moved them to the Model domain 
since Handle is related to Figure. 

How is the Process Iterative?

Obtaining an OOG is a semi-automated iterative process,
which involves the following steps:

A. Decide on the right top-level domains;

B. Decide on the right top-level objects in the top-level
domains. This may involve moving objects between the
top-level domains;

C. Achieve an adequate number of objects in each top-level
domain:

C(a). Push secondary objects underneath primary objects,
using:

C(a)(i). Strict encapsulation (private domains);

C(a)(ii). Logical containment (public domains);

C(b). Pass low-level objects linearly between objects;

C(c). Use type abstraction to merge fewer or more objects
in each domain, using:

C(c)(i). Trivial types;

C(c)(ii). Design intent types;

D. Achieve an appropriate level of visual detail:

D(a). Hide or show the substructure of a selected object;

D(b). Change the projection depth.

The tool adds any summary edges corresponding to the
elided substructure.

Why was One System Refactored?

Adding the annotations to HillClimber highlighted
refactoring opportunities [M. Abi-Antoun and J. Aldrich.
Ownership Domains in the Real World. In IWACO, 2007.], [M.
Abi-Antoun, J. Aldrich, and W. Coelho. A Case Study in
Re-engineering to Enforce Architectural Control Flow and
Data Sharing. JSS, 80(2), 2007.]. Without refactoring, the
annotations would enforce a degraded architecture. We did
not refactor JHotDraw or Aphyds, except to use generic types.
These were code bases that were developed prior to Java 1.5.
Most Java code is being refactored to use generics using
available tool support [R. M. Fuhrer, F. Tip, A. Kiezun, J.
Dolby, and M. Keller. Efficiently Refactoring Java
Applications to Use Generic Libraries. In ECOOP, 2005.]. When
adding annotations, one must choose between enforcing a
degraded architecture or refactoring to reduce tight coupling,
e.g., by programming to an interface, or introducing a
mediator object [M. Abi-Antoun and J. Aldrich. Ownership
Domains in the Real World. In IWACO, 2007.]. For example, in
HillClimber, the mediator object was introduced during a
refactoring.

Why Ownership Domains? 

The approach was presented in terms of the ownership
domains type system, where each object contains one or more
public or private domains, and each object is in exactly one
domain. In principle, the approach also applies to ownership
type systems that assume a single context per object [D.
Clarke, J. Potter, and J. Noble. Ownership Types for Flexible
Alias Protection. In OOPSLA, 1998.]. However, in an
owner-as-dominator type system, any access to a child object must
go through its owning object [D. Clarke, J. Potter, and J.
Noble. Ownership Types for Flexible Alias Protection. In
OOPSLA, 1998.]. In contrast, the ownership domains type
system supports pushing any object underneath any other
object in the ownership hierarchy: a child object may or may
not be encapsulated by its parent object. A child object can
still be referenced from outside its owner if it is part of a
public domain of its parent, or if a domain parameter is linked
to a private domain [J. Aldrich and C. Chambers. Ownership
Domains: Separating Aliasing Policy from Mechanism. In
ECOOP, 2004.]. This expressiveness makes it possible to
avoid an architecture that has too many top-level objects, as in
the first Aphyds OOG (FIG. 20(a)). If making an object
owned by another object restricts access to the owned object,
this forces more objects to be peers.

Dynamism. 

The OOG is an approximation of the actual runtime
architecture, one that is conservative and may include more than
will actually be there by virtue of using a sound static
analysis. However, the experimental evidence we have gathered on
several extended examples, as well as many other smaller
examples, indicates that the extracted architectures do not
suffer from too much or too little abstraction. In comparison,
Rayside et al. reported that a static object graph analysis
based on RTA produced unacceptable over-approximations
for most non-trivial programs [D. Rayside, L. Mendel, R.
Seater, and D. Jackson. An Analysis and Visualization for
Revealing Object Sharing. In Eclipse Technology eXchange
(ETX), 2005.]. Finally, the approach currently describes a
static component-and-connector architecture of a system, but
offers no facilities for specifying runtime architectural
changes [P. Oreizy, N. Medvidovic, and R. N. Taylor.
Architecture-Based Runtime Software Evolution. In ICSE, 1998.],
as in dynamic architecture description languages. As a result,
the approach does not address dynamic architectural
reconfiguration [J. Magee and J. Kramer. Dynamic Structure in
Software Architectures. In FSE, 1996.].


Overview of Apparatus and Methods Embodiments 

There are many variations and embodiments of the present 
invention. This section will discuss some, but not all, embodi- 
ments and variations of the invention. Several embodiments 
and variations of the apparatuses of the present invention will 
be discussed with regard to FIG. 21, and several embodiments
and variations of the methods of the present invention will be 
discussed with regard to FIG. 22. 

FIG. 21 illustrates one embodiment of a system 2110 
according to the present invention. In that embodiment, the 
system 2110 includes a processor 2112, memory 2114, an 
input device 2116, and an output or display device 2118, such 
as a monitor. The processor 2112 is connected to the memory 
2114, the input device 2116, and the output device 2118. The 
memory 2114 includes computer readable instructions, such 
as computer hardware, software, firmware, or other forms of 
computer-readable instructions which, when executed by the 
processor 2112, cause the processor 2112 to perform certain 
functions, as described herein. 

The processor 2112 performs certain functions, as
described herein. The processor 2112 may also receive input
from the input device 2116 (such as from a user, from another
computer, or from some other source), and provide signals to
control the output device 2118.

The memory 2114 can be any form of computer-readable
memory, and may store information in magnetic form, optical 
form, or other forms. The memory includes computer read- 
able instructions which, when executed by the processor 
2112, cause the processor 2112 to perform certain functions, 
as described herein. The memory 2114 may be separate from
the processor 2112, or the memory 2114 may be integrated 
with the processor 2112. The memory 2114 may also include 
more than one memory device, which may be integrated with 
the processor 2112, separate from the processor 2112, or 
both. 

The input device 2116 may be a keyboard, a touchscreen, a 
computer mouse, or other forms of inputting information 
from a user. 

The output device 2118 may be a video display or other 
forms of outputting information to a user. Many variations are 
possible with the system 2110 according to the present inven- 
tion. For example, more than one processor 2112, memory 
2114, input device 2116, and output device 2118 may be 
present in the system 2110. In addition, devices not shown in 
FIG. 21 may also be included in the system 2110, and devices
shown in FIG. 21 may be combined or integrated together into
a single device, or some devices may be omitted. 

For example, the present invention may be embodied as a 
computer, a plurality of computers or a computer system, 
computer-readable instructions, or other variations thereof. 
In one embodiment the present invention is an apparatus 
including a processor 2112 and memory 2114. The memory
includes computer readable instructions which, when 
executed by the processor, cause the processor to create an 
abstract graph from the computer program module and from 
containment information corresponding to the computer pro- 
gram module, wherein the abstract graph has nodes including 
types and objects, and wherein the abstract graph relates an 
object to a type, and wherein for a specific object the abstract 
graph relates the specific object to a type containing the 
specific object. The computer readable instructions also cause 
the processor to create a runtime graph from the abstract 
graph, wherein the runtime graph is a representation of the 
true runtime object graph, wherein the runtime graph repre- 
sents containment information such that, for a specific object, 
the runtime graph relates the specific object to another object
that contains the specific object. 

In another embodiment, the present invention is embodied 
as computer readable instructions which, when executed, 
cause a processor to perform the steps, or variations of the 
steps, described above. 

Many variations are possible for the computer readable 
instructions, such as computer readable instructions embod- 
ied on optical media, computer readable instructions embod- 
ied on magnetic media, and computer readable instructions 
embodied in solid state memory. 

FIG. 22 is a flow chart illustrating one embodiment of the 
method 2200 according to the present invention. According to 
that embodiment, the method 2200 includes creating an 
abstract graph 2210, creating a runtime graph 2212 from the 
abstract graph, and creating a display graph 2214 from the 
runtime graph. However, in some embodiments, the present 
invention may omit the step of creating the display graph 
2214, and include only creating the abstract graph 2210 and 
creating the runtime graph 2212. In addition, the C&C archi- 
tecture 2216 can be generated from the runtime graph 2212 or 
from the display graph 2214. Furthermore, the method, and 
the steps of the method, may include additional steps, limita- 
tions, and other modifications beyond those illustrated in 
FIG. 22. 

For example, according to one embodiment, the method 
2200 of the present invention produces a representation of a 
true runtime object graph from a computer program module. 
The method 2200 includes creating 2210 an abstract graph 
from the computer program module and from containment 
information corresponding to the computer program module, 
wherein the abstract graph has nodes including types and 
objects, and wherein the abstract graph relates an object to a 
type, and wherein for a specific object the abstract graph 
relates the specific object to a type containing the specific 
object. The method 2200 also includes creating a runtime 
graph 2212 from the abstract graph 2210, wherein the runtime 
graph 2212 is a representation of the true runtime object 
graph, wherein the runtime graph 2212 represents contain- 
ment information such that, for a specific object, the runtime 
graph 2212 relates the specific object to another object that 
contains the specific object. 
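
The relationship between the two graphs can be illustrated with a minimal Java sketch. In it, the abstract graph relates each object declaration to the type that contains it, and the runtime graph is obtained by unfolding those declarations so that each runtime object is related to the object that contains it. All names (GraphUnfoldingSketch, declare, unfold) are hypothetical, domains are omitted, and the rule that stops unfolding when a type recurs on the containment path is a simplifying assumption made for this sketch, not necessarily how the described analysis handles recursive types.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GraphUnfoldingSketch {
    /** Abstract-graph node: an object declaration, contained in a declaring type. */
    static final class AbstractObject {
        final String name;
        final String type;
        AbstractObject(String name, String type) { this.name = name; this.type = type; }
    }

    /** Runtime-graph node: an object related to the object that contains it. */
    static final class RuntimeObject {
        final String name;
        final String type;
        final RuntimeObject parent;
        final List<RuntimeObject> children = new ArrayList<>();
        RuntimeObject(String name, String type, RuntimeObject parent) {
            this.name = name; this.type = type; this.parent = parent;
        }
        String path() { return parent == null ? name : parent.path() + "/" + name; }
    }

    /** Abstract graph: type name -> object declarations that the type contains. */
    private final Map<String, List<AbstractObject>> contains = new LinkedHashMap<>();

    void declare(String containingType, String objectName, String objectType) {
        contains.computeIfAbsent(containingType, k -> new ArrayList<>())
                .add(new AbstractObject(objectName, objectType));
    }

    /** Builds the runtime graph by unfolding containment from a root object. */
    RuntimeObject unfold(String rootName, String rootType) {
        RuntimeObject root = new RuntimeObject(rootName, rootType, null);
        expand(root);
        return root;
    }

    private void expand(RuntimeObject node) {
        for (AbstractObject o : contains.getOrDefault(node.type, new ArrayList<>())) {
            RuntimeObject child = new RuntimeObject(o.name, o.type, node);
            node.children.add(child);
            // Assumed cutoff: do not unfold a type that already occurs on the path.
            if (!typeOnPath(node, o.type)) expand(child);
        }
    }

    private static boolean typeOnPath(RuntimeObject node, String type) {
        for (RuntimeObject n = node; n != null; n = n.parent) {
            if (n.type.equals(type)) return true;
        }
        return false;
    }
}
```

For example, declaring circuit:Circuit inside type Main and node:Node inside type Circuit, then unfolding from main:Main, yields a runtime object with path main/circuit/node whose parent is the circuit object.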

Many other variations and modifications are possible with
the present invention. For example, another embodiment of 
the present invention includes the further step of creating the 
display graph 2214 that is a hierarchical representation of the 
runtime graph 2212. 

According to another embodiment of the method 2200, 
creating the runtime graph 2212 includes creating a compo- 
nent-and-connector architectural view of the runtime graph 
wherein at least one object is represented as at least one 
component, and wherein relations between the plurality of 
objects are represented as at least one connector. 

In another embodiment of the method 2200, creating the
runtime graph 2212 includes producing at least one
containment relationship between a runtime object A and a runtime
object B when a type in the abstract graph corresponding to 
runtime object A contains an object in the abstract graph 2210 
corresponding to runtime object B. 

In another embodiment of the method 2200, the contain- 
ment information in the abstract graph 2210 specifies at least 
one owner-as-dominator relation between two objects in the 
true runtime object graph. 

In another embodiment of the method 2200, the contain- 
ment information in the abstract graph 2210 specifies at least 
one logical containment relation between two objects in the
true runtime object graph, without prohibiting access to the 
object that is contained. 

In another embodiment of the method, the containment
information denotes, for a variable in the computer program,
a relationship between an object referred to by that
variable and the object referred to by the “this” variable.

The method 2200 according to the present invention may 
also include obtaining the containment information at
compile-time without executing the computer program module.

In another embodiment of the method 2200, the
containment information is not executed when the computer program
module runs. For example, one embodiment of the invention
includes storing the containment information, wherein storing
the containment information is selected from a group
comprising storing the containment information as annotations
inside the computer program module, such that the
annotations are not executed when the computer program module
runs, and storing the containment information externally to
the computer program module.

In another embodiment the method 2200 further includes 
domains that represent conceptual groups of objects. 

In another embodiment, the abstract graph 2210 includes
edges between the domains.

In another embodiment, the runtime graph 2212 includes 
edges that represent relations between objects. 

In another embodiment, the runtime graph 2212 further
includes domains that represent conceptual groups of objects.
In that embodiment, for example, the runtime graph 2212
may further comprise edges between the domains.

In embodiments including the step of creating a display
graph 2214, the display graph may be a depth-limited
projection of the runtime graph 2212. In other embodiments, the
display graph 2214 may not be depth limited. Furthermore, in
some embodiments the display graph 2214 may further
contain, for two objects A and B that are related in a further
unfolding of the runtime graph 2212, an edge between objects
A' and B', wherein A' is the object from which A is unfolded
and B' is the object from which B is unfolded.
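
The depth-limited projection with lifted edges can be sketched as follows. This is an illustrative simplification in Java, not the described implementation: the `Node` class, its single `parent` pointer (standing in for the object from which a node is unfolded), and the string encoding of edges are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class DisplayGraphDemo {
    // Hypothetical runtime-graph node: a name plus the object it is unfolded from.
    static final class Node {
        final String name;
        final Node parent; // null for a top-level object

        Node(String name, Node parent) {
            this.name = name;
            this.parent = parent;
        }

        int depth() {
            return parent == null ? 0 : parent.depth() + 1;
        }
    }

    // Walk up the containment hierarchy until the node is visible at maxDepth.
    static Node visibleAncestor(Node n, int maxDepth) {
        while (n.depth() > maxDepth) {
            n = n.parent;
        }
        return n;
    }

    // Lift each runtime edge (A, B) to an edge between the visible ancestors A' and B'.
    static Set<String> project(List<Node[]> edges, int maxDepth) {
        Set<String> lifted = new LinkedHashSet<>();
        for (Node[] e : edges) {
            Node a = visibleAncestor(e[0], maxDepth);
            Node b = visibleAncestor(e[1], maxDepth);
            if (a != b) { // skip self-loops that lifting would create
                lifted.add(a.name + "->" + b.name);
            }
        }
        return lifted;
    }

    static Set<String> example() {
        Node model = new Node("model", null);
        Node view = new Node("view", null);
        Node listeners = new Node("listeners", model); // nested inside model
        Node canvas = new Node("canvas", view);        // nested inside view
        List<Node[]> edges = new ArrayList<>();
        edges.add(new Node[] { listeners, canvas });
        edges.add(new Node[] { model, listeners });
        return project(edges, 0); // show top-level objects only
    }

    public static void main(String[] args) {
        System.out.println(example());
    }
}
```

With a depth limit of zero, the edge between the two nested objects appears as an edge between their top-level parents, so the relation is still visible even though the substructure is elided.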

In embodiments creating a component-and-connector
architectural view, the component-and-connector view may
further include a nested sub-architecture for a component
based on the containment information of the group of objects
that correspond to said component. In addition, or
alternatively, the component-and-connector view may further
include runtime tiers of components based on
the domains of objects that correspond to the components.

In another embodiment, creating the runtime graph 2212
includes merging at least two objects from the abstract graph
2210 into a single object in the runtime graph 2212.
Furthermore, the merged objects may have the same containing
object in the abstract graph 2210. Alternatively, or in addition,
the nodes in the runtime graph 2212 may further comprise
domains of nodes and the merged objects may be part of the
same domain. Also, the type of one of the merged objects may
be a subtype of the type of the other merged object. In some
embodiments, merging at least two objects from the abstract
graph includes merging at least two objects that share at least
one least upper bound type. Furthermore, merging at least two
objects from the abstract graph may include merging at least
two objects that share at least one least upper bound type that
is not in a user-specified list of trivial types. In another
embodiment, the method 2200 may also include marking the
merged object as having an intersection type that includes all
of the least upper bound types of said objects. In another
embodiment, merging at least two objects from the abstract
graph includes merging at least two objects that have at least
one type that maps to the same user-specified design intent
type.
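
One of the merging criteria above (same domain, and one type a subtype of the other) can be illustrated with the following Java sketch. It is a simplified illustration under stated assumptions: the `AbstractObject` class, the string domain names, and the greedy grouping against the first member of each group are hypothetical, and the least-upper-bound and design-intent criteria are not modeled.

```java
import java.util.ArrayList;
import java.util.List;

class MergeDemo {
    // Hypothetical abstract-graph object: a declared type located in a named domain.
    static final class AbstractObject {
        final String domain;
        final Class<?> type;

        AbstractObject(String domain, Class<?> type) {
            this.domain = domain;
            this.type = type;
        }
    }

    // Two objects are merge candidates when they share a domain and
    // the type of one is a subtype of the type of the other.
    static boolean mergeable(AbstractObject a, AbstractObject b) {
        return a.domain.equals(b.domain)
                && (a.type.isAssignableFrom(b.type) || b.type.isAssignableFrom(a.type));
    }

    // Greedy grouping: each object joins the first group whose first member it can merge with.
    static List<List<AbstractObject>> merge(List<AbstractObject> objects) {
        List<List<AbstractObject>> groups = new ArrayList<>();
        for (AbstractObject o : objects) {
            List<AbstractObject> target = null;
            for (List<AbstractObject> g : groups) {
                if (mergeable(g.get(0), o)) {
                    target = g;
                    break;
                }
            }
            if (target == null) {
                target = new ArrayList<>();
                groups.add(target);
            }
            target.add(o);
        }
        return groups;
    }

    static int exampleGroupCount() {
        List<AbstractObject> objects = new ArrayList<>();
        objects.add(new AbstractObject("DATA", List.class));
        objects.add(new AbstractObject("DATA", ArrayList.class)); // subtype of List, same domain
        objects.add(new AbstractObject("DATA", String.class));    // unrelated type
        objects.add(new AbstractObject("VIEW", ArrayList.class)); // different domain
        return merge(objects).size();
    }

    public static void main(String[] args) {
        System.out.println("runtime objects after merging: " + exampleGroupCount());
    }
}
```

A type such as `java.lang.Object` would make every pair in a domain mergeable under this test, which is one reason a list of trivial types to exclude is useful.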

In another embodiment, the nodes in the abstract graph 
include domains, and creating an abstract graph includes 
creating at least one object that is located in a domain that is 
declared as a formal parameter of a type, and creating the 
runtime graph includes copying the object declared inside the 
formal domain in the abstract graph into at least one domain 
in the runtime graph to which the formal domain of the object 
is transitively bound. 
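
The copying of objects from a formal domain into the domains to which it is transitively bound can be sketched as below. This is a hedged Java illustration only: the string-based domain names (`Inner.P`, `Outer.Q`, `root.DATA`, `root.CACHE`), the `bindings` map, and the rule that an unbound name counts as an actual domain are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class FormalDomainDemo {
    // Follow bindings from a domain until reaching domains that are not bound any further.
    static Set<String> resolve(String domain, Map<String, List<String>> bindings) {
        Set<String> actuals = new TreeSet<>();
        collect(domain, bindings, new HashSet<>(), actuals);
        return actuals;
    }

    private static void collect(String domain, Map<String, List<String>> bindings,
                                Set<String> seen, Set<String> out) {
        if (!seen.add(domain)) {
            return; // guard against cyclic bindings
        }
        List<String> targets = bindings.get(domain);
        if (targets == null) { // no binding: treated here as an actual domain
            out.add(domain);
            return;
        }
        for (String t : targets) {
            collect(t, bindings, seen, out);
        }
    }

    // Copy each object declared in a formal domain into every actual domain it resolves to.
    static Map<String, List<String>> copyObjects(Map<String, String> objectToFormal,
                                                 Map<String, List<String>> bindings) {
        Map<String, List<String>> runtime = new TreeMap<>();
        for (Map.Entry<String, String> e : objectToFormal.entrySet()) {
            for (String actual : resolve(e.getValue(), bindings)) {
                runtime.computeIfAbsent(actual, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        return runtime;
    }

    static Map<String, List<String>> example() {
        Map<String, List<String>> bindings = new HashMap<>();
        bindings.put("Inner.P", Arrays.asList("Outer.Q"));                 // formal bound to formal
        bindings.put("Outer.Q", Arrays.asList("root.DATA", "root.CACHE")); // formal bound to actuals
        Map<String, String> objectToFormal = new HashMap<>();
        objectToFormal.put("elem", "Inner.P"); // object declared inside a formal domain
        return copyObjects(objectToFormal, bindings);
    }

    public static void main(String[] args) {
        System.out.println(example());
    }
}
```

In this example the object `elem` ends up in both `root.DATA` and `root.CACHE`, because the formal domain it was declared in reaches both through a chain of two bindings.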

In another embodiment, the runtime graph 2212 represents 
objects and relations between objects that exist at runtime. 

In another embodiment there exists, for a true runtime 
object graph of the computer program, a map from a true 
runtime object to an object in the runtime graph, such that a 
true runtime object is mapped to exactly one object in the 
runtime graph. 

In another embodiment, there exists, for a true runtime
object graph of the computer program, a map from a true
runtime relation between object o1 and object o2 in the true
runtime object graph, and there is a relation between the
runtime objects corresponding to said objects o1 and o2 in the
runtime graph.

FIG. 24 illustrates another embodiment of a system accord- 
ing to the present invention. Architectural intent is captured 
using annotations. If the annotations are added manually, a 
tool checks the annotations and the code for consistency. An 
inference tool can also help produce the annotations auto- 
matically or semi-automatically. An extraction tool uses the 
code and the annotations to extract an Ownership Object 
Graph. Another tool generates a Component-and-Connector 
Architecture from the Ownership Object Graph. Various soft- 
ware assurance techniques can be applied to the Ownership 
Object Graph or the Component-and-Connector Architec- 
ture. For instance, they can be used in architectural reviews. 
Various architectural-level analyses can analyze runtime 
quality attributes such as performance, dynamic coupling, 
security, or reliability. Another analysis can check and mea- 
sure the structural conformance of the extracted C&C view, 
which represents the as-built actual architecture, with an as- 
designed target architecture. 

Those and other variations of the methods according to the 
present invention are possible. 

7 Related Work 

Architectural Recovery. 

There is a large body of research on architectural recovery 
or architectural extraction. Most approaches use a mix of 
dynamic and static information such as naming conventions 
and directory structures [T. Richner and S. Ducasse. Recovering
High-Level Views of Object-Oriented Applications
from Static and Dynamic Information. In ICSM, 1999.]. The 
extractors often play detective and use trial and error with 
clustering algorithms [R. Kazman and S. J. Carriere. Playing 
Detective: Reconstructing Software Architecture from Avail- 
able Evidence. Automated Softw. Eng., 6(2), 1999.]. Even so, 
existing compile-time approaches mostly obtain abstracted 
module architectures [I. T. Bowman, R. C. Holt, and N. V. 
Brewster. Linux as a Case Study: its Extracted Software 
Architecture. In ICSE, 1999.], not runtime architectures.

In many of these tools, the abstraction mechanism is hard- 
coded in the tool and cannot be controlled with user-specified 
annotations. 

Some of these clustering methods are complementary to
this method and may help in the process of adding the
annotations to a computer program module. For instance, if a
clustering method derives how classes in a package interact
with other classes in another package, this may suggest
creating two top-level domains corresponding roughly to the
two packages. A small cluster that interacts with almost all
other packages may indicate a possible library or utility
package, and objects in such a package are often considered
shared.

Mapping Source to High-Level Models.

Murphy et al. produce a mapping of a source to a high-level
model using the Reflexion Models (RM) approach [G. C.
Murphy, D. Notkin, and K. J. Sullivan. Software Reflexion
Models: Bridging the Gap between Design and Implementation.
IEEE TSE, 27(4), 2001.]. In RM, the developer assigns
component families to classes using an external file. Then the
tool checks the relationships between these components and
reports any differences to the user. There are several
important differences with RM.

First, the object-oriented version of the RM method
(embodied in the jRMTool) maps classes to components. Such a
mapping is not appropriate for an object-oriented runtime
architecture and is more suitable for the code architecture. A
runtime architecture models runtime entities and their
potential interactions. Thus, in an object-oriented system, a
component is one or more objects. More specifically, RM cannot
map the same code entity to multiple design elements,
depending on the context in which it is used. A runtime
view of an object-oriented system may distinguish between
two instances of the same class in two different contexts. For
instance, a dataAccess component may connect to a
settingsDB component to read trusted configuration settings,
and to a dataDB component to access untrusted user data. A
security analysis that operates on that runtime view may
assign a High trustLevel to settingsDB and a Low trustLevel
to dataDB. In contrast, a module view would show one
element, assuming that the components are implemented as
two instances of the same java.io.File class. RM can only map
the java.io.File class to a single node in the high-level model.
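
The difference between the two views can be made concrete with a small Java sketch. The class names `ContextDemo`, `Component`, and `DataAccess`, the file names, and the trust-level strings are hypothetical and serve only to illustrate the settingsDB and dataDB example; they do not reproduce any tool's data model.

```java
import java.io.File;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class ContextDemo {
    // Hypothetical runtime component: one object plus the trust level assigned to it.
    static final class Component {
        final Object object;
        final String trustLevel;

        Component(Object object, String trustLevel) {
            this.object = object;
            this.trustLevel = trustLevel;
        }
    }

    // Two instances of the same class, used in two different contexts.
    static final class DataAccess {
        final File settingsDB = new File("settings.cfg"); // trusted configuration (hypothetical path)
        final File dataDB = new File("user.dat");         // untrusted user data (hypothetical path)
    }

    // Runtime view: one component per object, so each can carry its own property.
    static Map<String, Component> runtimeView(DataAccess da) {
        Map<String, Component> view = new LinkedHashMap<>();
        view.put("settingsDB", new Component(da.settingsDB, "High"));
        view.put("dataDB", new Component(da.dataDB, "Low"));
        return view;
    }

    // Class-based mapping: both objects collapse onto the single class java.io.File.
    static Set<Class<?>> moduleView(Map<String, Component> runtime) {
        Set<Class<?>> classes = new HashSet<>();
        for (Component c : runtime.values()) {
            classes.add(c.object.getClass());
        }
        return classes;
    }

    public static void main(String[] args) {
        Map<String, Component> runtime = runtimeView(new DataAccess());
        System.out.println("runtime components: " + runtime.size());
        System.out.println("module elements: " + moduleView(runtime).size());
    }
}
```

The runtime view keeps two components with different trust levels, while the class-based view has a single element and therefore no place to record the distinction.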

Second, RM does not extract a complete abstraction, to
avoid obtaining a model that developers do not recognize. In
our method, the OOG represents a complete model, but
developer-specified annotations help obtain meaningful
abstractions.

Finally, RM uses non-hierarchical high-level models and
maps, whereas our method produces hierarchical
representations. A developer writing the map manually must ensure that
a type and its subtypes are mapped to the same entity in the
high-level model. When mapping field or local variables, the
developer must also ensure that all objects that may be aliased
are mapped to the same high-level entity. In contrast, in our
method, a type system checks that the annotations are
consistent, and that the code is consistent with the annotations.
Moreover, the construction of the runtime graph handles aliasing
and inheritance. Producing the mapping file in the RM
approach appears more straightforward than adding
ownership annotations, but it is not amenable to type inference. The
more sophisticated source abstraction method is needed to
handle the runtime architectures of object-oriented systems
soundly, in the presence of inheritance and aliasing.

Dynamic Analyses.

There are several dynamic analyses for visualizing runtime 
structures [B. Schmerl, J. Aldrich, D. Garlan, R. Kazman, and 
H. Yan. Discovering Architectures from Running Systems. 
IEEE TSE, 32(7), 2006.], [C. Flanagan and S. N. Freund. 
Dynamic Architecture Extraction. In FLoC FATES-RV,
2006.]. As mentioned earlier, a static analysis is often pre- 
ferred to a dynamic analysis. First, runtime heap information 
does not convey design intent. Second, a dynamic analysis
may not be repeatable, i.e., changing the inputs or executing 
different use cases might produce different results. Compared 
to dynamic ownership analyses, which are descriptive and
show the ownership structure in a single run of a program, the 
OOG obtained at compile time is prescriptive and shows 
ownership relations that will be invariant over all program 
runs. Third, a dynamic analysis cannot be used on an incom- 
plete program still under development or to analyze a frame- 
work separately from a specific instantiation. Finally, some 
dynamic analyses carry a significant runtime overhead — a 
10x-50x slowdown in one case [C. Flanagan and S. N. Fre- 
und. Dynamic Architecture Extraction. In FLoC FATES-RV, 
2006.], which must be incurred each time the analysis is run, 
whereas the main cost of adding annotations is incurred once. 

A closely related dynamic analysis, DiscoTect [B.
Schmerl, J. Aldrich, D. Garlan, R. Kazman, and H. Yan.
Discovering Architectures from Running Systems. IEEE TSE, 32(7),
2006.], recovers a non-hierarchical C&C view from a running
program, one that shows one component for each instance
created at runtime. Such views must be manually
post-processed to consolidate multiple components into one. The
present invention describes an automated analysis that can
convert an OOG into a C&C view that is hierarchical and does
not require manual post-processing.

Visualization. 

Software visualization research shows different aspects of 
the execution structure of a running program [M. Sefika, A. 
Sane, and R. Campbell. Architecture Oriented Visualization. 
In OOPSLA, 1996.], [W. De Pauw, E. Jensen, N. Mitchell, G. 
Sevitsky, J. M. Vlissides, and J. Yang. Visualizing the Execu- 
tion of Java Programs. In Software Visualization, 2002.]. 
Compared to our approach, dynamic visualization 
approaches do not require source code annotations and allow 
more fine-grained user interaction in producing abstractions. 
But these analyses often work at the granularity of an object 
or a class and produce task-specific views. The present inven- 
tion offers advantages such as having developer-specified 
ownership annotations drive a sound static extraction of a 
system’s runtime architecture. 

Dynamic Ownership Analyses. 

More closely related are dynamic analyses that infer the 
runtime ownership structures. These techniques do not 
require program annotations but assume a strict owner-as- 
dominator model which cannot represent many design idi- 
oms. 

Rayside et al. produce matrix displays of the ownership
structure [D. Rayside, L. Mendel, and D. Jackson. A Dynamic
Analysis for Revealing Object Ownership and Sharing. In
Workshop on Dynamic Analysis (WODA), 2006.]. Similarly,
Mitchell uses lightweight ownership inference to examine
a single heap snapshot rather than the entire program
execution, and scales the approach to large programs through
extensive graph transformation and summarization [N.
Mitchell. The Runtime Structure of Object Ownership. In
ECOOP, 2006.]. Noble, Potter, Potanin et al. showed both
matrix and graph views of ownership structures and
demonstrated that ownership is effective at organizing runtime
object structures [T. Hill, J. Noble, and J. Potter. Scalable
Visualizations of Object-Oriented Systems with Ownership
Trees. J. Visual Languages and Computing, 13(3), 2002.],
[A. Potanin, J. Noble, and R. Biddle. Checking Ownership
and Confinement. Concurrency and Computation: Practice
and Experience, 16(7), 2004.]. We use the same key insight
but in a static analysis that must address additional
challenges.


Object Graph Analyses. 

Several static analyses produce non-hierarchical object
graphs without using annotations. Pangaea [A. Spiegel.
Automatic Distribution of Object-Oriented Programs. PhD thesis,
FU Berlin, 2002.] produces a flat object graph without an alias
analysis and is unsound (the Pangaea output for JHotDraw is
even more complex than FIG. 2). WOMBLE [D. Jackson
and A. Waingold. Lightweight Extraction of Object Models
from Bytecode. TSE, 27(2), 2001.] uses syntactic heuristics
and abstraction rules for container classes to obtain an object
model including multiplicities. The Womble analysis is
unsound and aliasing-unaware by design. Ajax [R. W.
O’Callahan. Generalized Aliasing as a Basis for Program
Analysis Tools. PhD thesis, CMU, 2001.] uses an alias
analysis to build a refined object model as a conservative static
approximation of the heap graph reachable from a given set of
root objects. However, Ajax does not use ownership and
produces flat object graphs. Its output was manually
post-processed to remove “lumps” with more than seven incoming
edges [R. W. O’Callahan. Generalized Aliasing as a Basis for
Program Analysis Tools. PhD thesis, CMU, 2001, p. 248]. In
our approach, we often suppress shared objects and their
associated edges since they often add needless clutter. Even
though excluding shared objects makes the resulting diagram
unsound, the use of the shared annotation is entirely under the
control of the developer adding the annotations. A developer
can easily avoid the shared annotation if she is interested in
reasoning about all objects in the system: shared is an escape
hatch mainly designed to easily interoperate with legacy code
or third-party libraries [J. Aldrich, V. Kostadinov, and C.
Chambers. Alias Annotations for Program Understanding. In
OOPSLA, 2002.]. Finally, Ajax’s heavyweight but precise
alias analysis does not scale to large programs. Flat object
graphs do not provide architectural abstraction and do not
scale, because the number of top-level objects in the
architecture increases with the program size.

Lam and Rinard [P. Lam and M. Rinard. A Type System
and Analysis for the Automatic Extraction and Enforcement
of Design Information. In ECOOP, 2003.] proposed a type
system and a static analysis (which we refer to here as LR)
whereby developer-specified annotations guide the static
abstraction of an object model by merging objects based on
tokens. LR supports two kinds of tokens. The first kind, token
parameters, are a loose adaptation of ownership type
parameters that predate them [D. Clarke, J. Potter, and J. Noble.
Ownership Types for Flexible Alias Protection. In OOPSLA,
1998.], and correspond roughly to domain parameters.
Compared to Reflexion Models, LR can map a single code
element to multiple design elements. However, token
parameters lack semantics. For instance, they do not give any
precision about aliasing. The second kind, global static tokens,
correspond loosely to top-level domains. In LR, each token
parameter C<p1> is bound to another B<p2>, and transitively
to a global token. In ownership domains, a class C can declare
a private or a public domain D. Each instance of C gets a fresh
instance of D, so obj1.D ≠ obj2.D for fresh obj1 and obj2. In
addition to binding to another domain parameter B<d2>, a
domain parameter C<d> can bind to a locally declared domain,
e.g., objB.D. These local domains create the OOG hierarchy.
LR has no counterpart to local domains; as a result, an analysis
based on LR can only extract non-hierarchical representations
of object models. Recall that in our evaluation of Aphyds, we
used private domains in Round 1; note the (+) sign on most
objects for the elided substructure in FIG. 20(a). To get the
OOG in FIG. 20(b), one that is comparable to the as-designed
architecture in FIG. 1, we used public domains in Round 2,
thus confirming that hierarchy is
indispensable. Lam and Rinard do not mention inheritance in
their paper and their formal system omits inheritance entirely
[P. Lam and M. Rinard. A Type System and Analysis for the
Automatic Extraction and Enforcement of Design Information.
In ECOOP, 2003, FIG. 10]. The description of the
present invention discussed the challenges that multiple
interface inheritance introduces and how our method addresses
them. Lam and Rinard gave no soundness proof of the
underlying type system or of the extracted object model. Finally,
the LR system was evaluated on one 1.7 KLOC system,
whereas we evaluated the OOG extraction on several systems
of 8-16 KLOC each. The LR type system is only descriptive,
despite the use of the word “enforcement” in the title, and
does not enforce a tiered architecture in code like the
ownership domains type system [J. Aldrich and C. Chambers.
Ownership Domains: Separating Aliasing Policy from Mechanism.
In ECOOP, 2004.], [M. Abi-Antoun and J. Aldrich.
Ownership Domains in the Real World. In IWACO, 2007.].
Our approach does not require special annotations just to
extract a design [P. Lam and M. Rinard. A Type System and
Analysis for the Automatic Extraction and Enforcement of
Design Information. In ECOOP, 2003.] but leverages
well-researched ownership types [D. Clarke, J. Potter, and J.
Noble. Ownership Types for Flexible Alias Protection. In
OOPSLA, 1998.], [J. Aldrich and C. Chambers. Ownership
Domains: Separating Aliasing Policy from Mechanism. In
ECOOP, 2004.] that also have uses in program verification [Y.
Lu, J. Potter, and J. Xue. Validity Invariants and Effects. In
ECOOP, 2007.]. Finally, work on ownership inference could
reduce the annotation cost [J. Aldrich, V. Kostadinov, and C.
Chambers. Alias Annotations for Program Understanding. In
OOPSLA, 2002.], [Y. Liu and A. Milanova. Ownership and
Immutability Inference for UML-based Object Access
Control. In ICSE, 2007.], [K.-K. Ma and J. S. Foster. Inferring
Aliasing and Encapsulation Properties for Java. In OOPSLA,
2007.].

8 Conclusion 

We proposed a novel approach to statically extract a sound
hierarchical runtime architecture from object-oriented
programs, written in existing languages, using existing libraries
and general design idioms. The approach relies on ownership
domain annotations to specify and enforce in code the
architectural intent related to object encapsulation and
communication.

We evaluated the approach using several real medium-sized
programs. From an annotated program, a tool can
quickly extract a runtime architecture that conveys
meaningful abstractions and gives various insights by identifying
undocumented information or contradicting manual
documentation.

Those and other variations and modifications of the present 
invention are possible and contemplated, and it is intended 
that the foregoing specification and the following claims 
cover such modifications and variations.

The invention claimed is: 

1. A method for producing a representation of a true runt- 
ime object graph from a static computer program module, 
comprising: 

creating an abstract graph from the static computer program
module and from containment information corresponding
to the static computer program module,
wherein the abstract graph has nodes including types,
domains and objects wherein the abstract graph relates
an object to a type, wherein the abstract graph relates a
domain to a type containing the domain, wherein the
abstract graph relates an object to a domain containing
the object, and wherein creating an abstract graph
includes creating at least one object that is located in a
domain that is declared as a formal parameter of a type;

creating a runtime graph from the abstract graph, wherein 
the runtime graph is an approximation of the true runt- 
ime object graph, wherein the runtime graph instantiates 
the objects from the abstract graph and represents con- 
tainment information such that each runtime object con- 
tains runtime domains and each runtime domain con- 
tains runtime objects, and wherein creating the runtime 
graph includes copying the object declared inside the 
formal domain in the abstract graph into at least one 
domain in the runtime graph to which the formal domain 
of the object is transitively bound. 

2. The method of claim 1, further comprising creating a 
display graph that is a hierarchical representation of the runt- 
ime graph. 

3. The method of claim 1, further comprising creating a 
component-and-connector architectural view of the runtime
graph wherein at least one object is represented as at least one 
component, and wherein relations between the plurality of 
objects are represented as at least one connector. 

4. The method of claim 1, further comprising producing at 
least one containment relationship in the runtime graph 
between a runtime object A and a runtime object B when a 
type in the abstract graph corresponding to runtime object A 
contains an object in the abstract graph corresponding to 
runtime object B. 

5. The method of claim 1, wherein the containment infor- 
mation specifies at least one owner-as-dominator relation 
between two objects in the true runtime object graph. 

6. The method of claim 1, wherein the containment infor- 
mation specifies at least one logical containment relation 
between two objects in the true runtime object graph, without 
prohibiting access to the object that is contained. 

7. The method of claim 1, wherein the containment infor- 
mation denotes, for a variable in the computer program, a 
relationship between the object referred to by that variable 
and the object referred to by the “this” variable. 

8. The method of claim 1, further comprising, prior to 
creating the abstract graph, a step selected from a group 
comprising: storing the containment information as annota- 
tions inside the computer program module, such that the 
annotations are not executed when the computer program 
module runs; storing the containment information externally 
to the computer program module; and retrieving the contain- 
ment information at compile-time from a source external to 
the computer program module and without executing the 
computer program module. 

9. The method of claim 1, wherein domains represent con- 
ceptual groups of objects. 

10. The method of claim 9, wherein the abstract graph 
includes edges between the domains. 

11. The method of claim 1, wherein the runtime graph 
includes edges that represent relations between objects that 
will exist at runtime. 

12. The method of claim 1, wherein the runtime graph 
further comprises edges between the domains. 

13. The method of claim 2, wherein the display graph is a 
depth-limited projection of the runtime graph. 

14. The method of claim 2, wherein the display graph 
further contains, for two objects A and B that are related in a 
further unfolding of the runtime graph, an edge between 
objects A' and B', wherein A' is the object from which A is 
unfolded and B' is the object from which B is unfolded. 

15. The method of claim 3, wherein the component-and-connector
view further includes a nested sub-architecture for
a component based on the containment information of the
group of objects that correspond to said component.

16. The method of claim 3, wherein the component-and-connector
view further includes runtime tiers of components
based on the domains of objects that correspond to the
components.

17. The method of claim 1, wherein creating the runtime 
graph includes merging at least two objects from the abstract 
graph into a single object in the runtime graph. 

18. The method of claim 17, wherein merging at least two
objects is selected from a group comprising: merging at least
two objects that have a common containing object in the
abstract graph; and merging at least two objects that are part
of the same domain; and merging at least two objects where
one object has a type that is a subtype of a type of the other
merged object.

19. The method of claim 17, wherein merging at least two 
objects from the abstract graph includes merging at least two 
objects that share at least one least upper bound type that is 
not in a list of types selected from a group comprising: empty;
having user-specified values; and having default values. 

20. The method of claim 17, wherein merging at least two 
objects from the abstract graph includes merging at least two 
objects, such that each of the two objects has at least one type, 
but they both map to the same user-specified design intent
type. 

21. An apparatus, comprising: 

a processor; 

memory including computer readable instructions which, 
when executed by the processor, cause the processor to:

create an abstract graph from the static computer program 
module and from containment information correspond- 
ing to the static computer program module, wherein the 
abstract graph has nodes including types, domains and 
objects wherein the abstract graph relates an object to a
type, wherein the abstract graph relates a domain to a 
type containing the domain, wherein the abstract graph 
relates an object to a domain containing the object, and 
wherein creating an abstract graph includes creating at 
least one object that is located in a domain that is
declared as a formal parameter of a type; 
create a runtime graph from the abstract graph, wherein the 
runtime graph is an approximation of the true runtime 
object graph, wherein the runtime graph instantiates the 
objects from the abstract graph and represents contain- 
ment information such that each runtime object contains 
runtime domains and each runtime domain contains 
runtime objects, and wherein creating the runtime graph 
includes copying the object declared inside the formal 
domain in the abstract graph into at least one domain in 
the runtime graph to which the formal domain of the 
object is transitively bound. 

22. Computer readable instructions, stored on a non-tran- 
sitory medium, which, when executed, cause a processor to: 
create an abstract graph from the static computer program 
module and from containment information correspond- 
ing to the static computer program module, wherein the 
abstract graph has nodes including types, domains and 
objects wherein the abstract graph relates an object to a 
type, wherein the abstract graph relates a domain to a 
type containing the domain, wherein the abstract graph 
relates an object to a domain containing the object, and 
wherein creating an abstract graph includes creating at 
least one object that is located in a domain that is 
declared as a formal parameter of a type; 
create a runtime graph from the abstract graph, wherein the 
runtime graph is an approximation of the true runtime 
object graph, wherein the runtime graph instantiates the 
objects from the abstract graph and represents contain- 
ment information such that each runtime object contains 
runtime domains and each runtime domain contains 
runtime objects, and wherein creating the runtime graph 
includes copying the object declared inside the formal 
domain in the abstract graph into at least one domain in 
the runtime graph to which the formal domain of the 
object is transitively bound. 



